LiteRT NeuroPilot Unlocks Phone NPUs: The Secret to 1600+ Tokens/sec On-Device LLMs

12 days ago 高效码农

Google LiteRT NeuroPilot: Making Phone NPUs “First-Class Citizens” for On-Device LLMs

In the push for faster, more private AI experiences, running Large Language Models (LLMs) directly on devices is the critical next step. Yet fitting models with billions of parameters into smartphones and running them smoothly remains a significant challenge for developers. The LiteRT NeuroPilot Accelerator stack, recently launched by Google and MediaTek, aims to make the NPUs (Neural Processing Units) in MediaTek’s Dimensity series chips the “preferred target” for on-device LLMs. This is not just another technical update; it seeks to fundamentally change how developers interact …