Google Unveils Gemma 3n AI Model That Works Offline And On Budget Phones With 2GB Of RAM

Published: Saturday, June 28, 2025, 8:01 [IST]

Google has unveiled Gemma 3n, an AI model that runs on smartphones without internet access. It handles multimodal tasks with as little as 2GB of memory. First announced in May 2025, Gemma 3n is now fully launched, bringing audio, image, video, and text processing to devices with limited memory.

At the core of Gemma 3n is the MatFormer architecture, which resembles Russian nesting dolls by incorporating smaller sub-models within larger ones. This design allows developers to adjust performance based on hardware availability. The model comes in two versions: E2B, requiring just 2GB of memory, and E4B, needing around 3GB.
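
The nesting-doll idea can be illustrated with a minimal Python sketch. This is not Google's implementation: it assumes a single toy feed-forward block in which the smaller sub-model simply reuses the first few hidden units of the larger one. The function names (make_ffn, ffn_forward, count_params) and all sizes are hypothetical, chosen only for illustration.

```python
import random


def make_ffn(d_model, d_hidden, seed=0):
    """Create random weights for one toy feed-forward block (illustrative only)."""
    rng = random.Random(seed)
    # w_in has one row per hidden unit; w_out has one row per output dimension.
    w_in = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_hidden)]
    w_out = [[rng.uniform(-1, 1) for _ in range(d_hidden)] for _ in range(d_model)]
    return w_in, w_out


def ffn_forward(x, w_in, w_out, width):
    """Run the block using only the first `width` hidden units (the nested sub-model)."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(w_in[j], x)))  # ReLU activation
        for j in range(width)
    ]
    return [
        sum(w_out[i][j] * hidden[j] for j in range(width))
        for i in range(len(w_out))
    ]


def count_params(d_model, width):
    """Weights actually touched when running at a given hidden width."""
    return 2 * d_model * width


if __name__ == "__main__":
    w_in, w_out = make_ffn(d_model=4, d_hidden=8)
    x = [1.0, -0.5, 0.25, 2.0]
    large = ffn_forward(x, w_in, w_out, width=8)  # full model
    small = ffn_forward(x, w_in, w_out, width=4)  # sub-model nested inside it
    print("large output:", large)
    print("small output:", small)
    print("weights used:", count_params(4, 8), "vs", count_params(4, 4))
```

Both calls share one set of weights; the smaller one just reads a prefix of them, which is why a developer could in principle pick a size to match the device without storing two separate models.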


Innovative Features of Gemma 3n

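Gemma 3n’s efficiency stems from innovations like Per-Layer Embeddings (PLE), which allow a large share of the model’s parameters, the embeddings tied to each layer, to be loaded and computed on the phone’s central processor rather than held in the more limited memory of its graphics or AI accelerator. This approach conserves valuable memory while maintaining high performance. Although E2B and E4B have 5 billion and 8 billion raw parameters respectively, they run with memory footprints comparable to conventional 2-billion and 4-billion-parameter models.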

KV Cache Sharing speeds up the model’s initial processing of lengthy audio and video inputs. Google claims the feature cuts the time to a first response by up to half, making real-time applications such as voice assistants or video analysis more feasible on mobile devices.
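
A rough Python sketch shows why sharing a key-value cache reduces work. It is a simplified toy, not Gemma 3n's actual mechanism: it assumes that every layer above a chosen one reuses that layer's cached keys and values instead of computing its own. The function name prefill_kv, the share_from parameter, and the layer and token counts are all hypothetical, and the resulting numbers are illustrative rather than Google's measurements.

```python
def prefill_kv(tokens, num_layers, share_from=None):
    """Toy prefill pass: build a per-layer key/value cache and count the work done.

    If `share_from` is set, every layer above it reuses that layer's cache
    instead of computing new key/value entries.
    """
    cache = {}
    computed = 0
    for layer in range(num_layers):
        if share_from is not None and layer > share_from:
            cache[layer] = cache[share_from]  # reuse: no new computation
        else:
            # Stand-in strings for the real key/value projections of each token.
            cache[layer] = [(f"k{layer}:{t}", f"v{layer}:{t}") for t in tokens]
            computed += len(tokens)
    return cache, computed


if __name__ == "__main__":
    tokens = ["frame1", "frame2", "frame3"]
    _, baseline = prefill_kv(tokens, num_layers=4)
    _, shared = prefill_kv(tokens, num_layers=4, share_from=1)
    print("key/value entries computed without sharing:", baseline)
    print("key/value entries computed with sharing:", shared)
```

The saving grows with input length, since each shared layer skips one set of key/value computations per token.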

Enhanced Speech and Visual Capabilities

For speech-based functionalities, Gemma 3n incorporates an audio encoder derived from Google’s Universal Speech Model. This enables tasks like speech-to-text and language translation directly on a smartphone. Early tests have shown particularly strong results when translating between English and European languages such as Spanish, French, Italian, and Portuguese.

The visual component of Gemma 3n is powered by MobileNet-V5, Google’s new lightweight vision encoder. It can manage video streams at up to 60 frames per second on devices like the Google Pixel. This capability ensures smooth real-time video analysis while outperforming previous models in both speed and accuracy.

Developer Access and Offline Functionality

Developers can utilise Gemma 3n through popular tools such as Hugging Face Transformers, Ollama, MLX, and llama.cpp, among others. Google has also introduced the “Gemma 3n Impact Challenge,” encouraging developers to create applications leveraging the model’s offline capabilities. A $150,000 prize pool will be shared among winners.

A significant advantage of Gemma 3n is its ability to function entirely offline. This makes it well suited to AI-powered applications in remote areas or privacy-sensitive environments where cloud-based models are unsuitable.

With text support for over 140 languages and multimodal understanding of content in 35 languages, Gemma 3n sets a new benchmark for efficient on-device AI accessibility. Its launch marks a significant step forward in making advanced AI features available directly on low-power devices without relying on cloud infrastructure.
