Google has unveiled Gemma 3n, a cutting-edge AI model capable of operating on smartphones without internet access. This model supports advanced multimodal tasks using only 2GB of memory. Initially announced in May 2025, Gemma 3n is now fully launched, bringing sophisticated audio, image, video, and text processing capabilities to devices with limited memory.
At the core of Gemma 3n is the MatFormer architecture, which resembles Russian nesting dolls by incorporating smaller sub-models within larger ones. This design allows developers to adjust performance based on hardware availability. The model comes in two versions: E2B, requiring just 2GB of memory, and E4B, needing around 3GB.

Innovative Features of Gemma 3n
Gemma 3n’s efficiency stems from innovations like Per-Layer Embeddings (PLE), which shift some processing from the phone’s graphics processor to its central processor. This approach conserves valuable memory while maintaining high performance. Despite having between 5 and 8 billion raw parameters, both versions function efficiently.
The introduction of KV Cache Sharing significantly enhances the speed at which the model processes lengthy audio and video inputs. Google claims this feature improves response times by up to twofold, making real-time applications such as voice assistants or video analysis more feasible on mobile devices.
Enhanced Speech and Visual Capabilities
For speech-based functionalities, Gemma 3n incorporates an audio encoder derived from Google’s Universal Speech Model. This enables tasks like speech-to-text and language translation directly on a smartphone. Early tests have shown particularly strong results when translating between English and European languages such as Spanish, French, Italian, and Portuguese.

The visual component of Gemma 3n is powered by MobileNet-V5, Google’s new lightweight vision encoder. It can manage video streams at up to 60 frames per second on devices like the Google Pixel. This capability ensures smooth real-time video analysis while outperforming previous models in both speed and accuracy.
Developer Access and Offline Functionality
Developers can utilise Gemma 3n through popular tools like Hugging Face Transformers, Ollama, MLX, llama.cpp, among others. Google has also introduced the “Gemma 3n Impact Challenge,” encouraging developers to create applications leveraging the model’s offline capabilities. A $150,000 prize pool will be shared among winners.
A significant advantage of Gemma 3n is its ability to function entirely offline without needing an internet connection. This feature makes it ideal for AI-powered applications in remote areas or privacy-sensitive environments where cloud-based models are unsuitable.
Supporting over 140 languages and understanding content in 35 languages, Gemma 3n sets a new benchmark for efficient on-device AI accessibility. Its launch marks a significant step forward in making advanced AI features available directly on low-power devices without relying on cloud infrastructure.
Best Mobiles in India
1,29,999
22,999
64,999
99,999
29,999
39,999
63,999
1,56,900
96,949
1,39,900
1,29,900
79,900
65,900
12,999
96,949
16,499
38,999
30,700
49,999
19,999
17,970
21,999
13,474
18,999
22,999
19,999
17,999
26,999
5,999
Story first
published: Saturday, June 28, 2025, 8:01 [IST]