
Google announced Gemini Nano 4 at Google I/O 2026, with a developer preview opening on April 2nd. As Mobile Team Lead at NineTwoThree, I spent time digging into both Gemma 4 and Gemini Nano 4 to understand what this actually means for the apps we build. My conclusion, after going through the documentation, the developer preview, and the device requirements: the technology is genuinely impressive, and the audience for it right now is very small.
That's not a dismissal. The edge AI story here is real, and the trajectory is worth paying attention to. But being honest about where things stand is more useful than getting carried away with specs.
Google previewed Gemini Nano 4 as the next generation of its on-device AI model for Android, built on the open Gemma 4 model family. The headline numbers: up to 4x faster and 60% more battery efficient than Nano 3, multimodal support across text, image, and audio, and native support for 140+ languages. On paper, that represents a real step forward.
Gemini Nano is the on-device tier of Google's Gemini model family for Android. Unlike cloud-based models, it runs entirely within Android's AICore system service, meaning no network calls, no per-request cost, and no data leaving the device. The model weights live on the phone and get updated through Google Play system updates.
Nano 4 comes in two variants, both built on the Gemma 4 open model family.
Both variants run inside AICore, which offloads inference to the device's dedicated AI accelerator, whether that's Google's Tensor TPU, a MediaTek Dimensity NPU, or a Qualcomm Snapdragon NPU. Apps don't bundle the model. They request it from AICore and interact with it through ML Kit GenAI APIs or the lower-level AI Edge SDK. For developers, the practical upside is that code written today against Gemma 4 in the AICore Developer Preview will run on Nano 4-enabled devices when consumer hardware ships later in 2026.
This is where the story gets complicated.
To run Gemini Nano 4, a device needs at least 12GB of RAM, a flagship SoC with a supported AI accelerator, and Gemini Nano v3 or higher already on board. That combination, as of mid-2026, narrows the supported device list down to the Pixel 10 lineup, the Samsung Galaxy S26 series, and a small number of high-end phones from Oppo, OnePlus, and Xiaomi.
Phones that don't qualify include the Pixel 9 Pro, the Galaxy S25 Ultra, and the Samsung Galaxy Z Fold 7. These are current flagship devices. The Z Fold 7 has plenty of RAM. The Pixel 9 Pro runs Google's own Tensor G4, marketed as an AI-first chip. But because they run Nano v2 rather than Nano v3, they fall outside the requirement. As How-To Geek reported, Google's strict spec requirements rule out most existing phones, including some of its own current Pixel 9 lineup.
To make this concrete from inside NineTwoThree: our own CTO has been an Android user for over ten years and currently runs a Pixel 9 Pro. He cannot access the developer demo. That's a useful signal about real-world reach.
For offline AI features in production apps, the practical read is straightforward. If a Nano 4-only capability ships today, the addressable user base is essentially people who bought a Pixel 10 or Galaxy S26 and opted into the AICore developer preview. Estimates put Nano 4-compatible devices at roughly 1-3% of the active Android install base. Any feature requiring Nano 4 directly will miss 97-99% of users.
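To put that range in terms of a real audience, the arithmetic can be sketched in a few lines of Python. The 200,000-user figure is purely hypothetical, and the 1-3% share is the estimate quoted above, not a measured number for any particular app.

```python
def addressable_users(active_users: int, compatible_share: float) -> int:
    """Return how many users could run a feature that requires Nano 4."""
    if not 0.0 <= compatible_share <= 1.0:
        raise ValueError("compatible_share must be between 0 and 1")
    return round(active_users * compatible_share)


# Hypothetical app audience; substitute your own analytics figure.
ACTIVE_ANDROID_USERS = 200_000

# Low and high ends of the 1-3% estimate from the text.
low = addressable_users(ACTIVE_ANDROID_USERS, 0.01)
high = addressable_users(ACTIVE_ANDROID_USERS, 0.03)

print(f"Reachable users: {low} to {high}")
print(f"Users missed: {ACTIVE_ANDROID_USERS - high} to {ACTIVE_ANDROID_USERS - low}")
```

Running the same calculation against the device breakdown in your own analytics is more reliable than the platform-wide estimate, since a flagship-heavy audience can look quite different.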
Multimodality is one of the more interesting parts of this release. Nano 4 supports text input and output; image input, with noticeably better OCR, chart understanding, and handwriting recognition than its predecessor; and audio input through the ML Kit Speech Recognition API. All of it runs locally without a network connection.
Google positions Nano for a specific set of tasks through its ML Kit GenAI APIs: summarization, proofreading, tone rewriting, image description, voice-to-action conversion, and custom prompting via the Prompt API. These are bounded tasks that work well on-device precisely because they don't need live information and fit within the model's context window. Summarizing a long article or email thread the user has already downloaded, or running a grammar check on a draft before sending, are clear edge AI examples of this pattern.
Reasoning has also improved meaningfully in this generation. Gemma 4 handles chain-of-thought instructions, conditional logic, and basic math word problems at noticeably higher quality than Nano 3. For classifiers, content moderation, and structured data extraction, that improvement is practical.
Nano 4's on-device context window hasn't been officially confirmed by Google. Gemma 4's open models support up to 128K tokens on capable hardware, but the AICore implementation will likely land around 32-64K tokens given mobile memory and latency budgets. Long-context recall on Gemma 4 measures around 66% at 128K tokens on the RULER benchmark, so accuracy degrades for very long inputs.
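Because the real limit is unconfirmed, it may be worth budgeting conservatively and splitting long inputs before they reach the model. The sketch below is platform-neutral Python, not Android code. The 32K budget is the low end of the guess above, the four-characters-per-token ratio is a rough heuristic, and every name in it is hypothetical.

```python
CHARS_PER_TOKEN = 4              # rough heuristic, not a real tokenizer
ASSUMED_CONTEXT_TOKENS = 32_000  # assumption: low end of the 32-64K guess
RESERVED_OUTPUT_TOKENS = 1_000   # assumption: room left for the response


def estimate_tokens(text: str) -> int:
    """Estimate the token count with ceiling division on character length."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str,
                    context_tokens: int = ASSUMED_CONTEXT_TOKENS,
                    reserved: int = RESERVED_OUTPUT_TOKENS) -> bool:
    """True if the estimated input still leaves the reserved output budget."""
    return estimate_tokens(text) <= context_tokens - reserved


def chunk_paragraphs(text: str, max_tokens: int) -> list[str]:
    """Group paragraphs into chunks of roughly max_tokens each.

    Paragraph separators are ignored in the estimate, and a single paragraph
    larger than max_tokens is kept whole as its own chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for para in text.split("\n\n"):
        if not para.strip():
            continue  # skip empty paragraphs
        para_tokens = estimate_tokens(para)
        if current and current_tokens + para_tokens > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += para_tokens
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

A caller would check `fits_in_context` first and only fall back to `chunk_paragraphs` (summarizing each chunk, then summarizing the summaries) when the input is too long. Once Google publishes the actual limit and a token-counting API, both constants should be replaced with real values.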
More importantly, Nano 4 has no access to live data. It has training-time knowledge only. Any feature that needs current prices, recent news, or real-time information still requires a server call. Google explicitly steers Nano away from open-ended chat for this reason.
Features like tool calling, structured output, system prompts, and a thinking mode are listed as coming during the preview period. They're not available yet, and building a production roadmap around them before they officially ship is premature.
The interesting question isn't whether Nano 4 is good. It is. The question is whether it changes what we build today.
As an edge AI capability, Gemini Nano 4 belongs in one specific architectural pattern: progressive enhancement. The feature works on a small number of supported devices and falls back to a cloud call, a lighter on-device model, or no AI at all on everything else. Google's own developer guidance illustrates this directly. The developer checks FeatureStatus, runs the on-device path if available, and routes to a server-side alternative otherwise. The documentation frames the current period as a "head start on refining prompts and exploring use cases," not a production-ready deployment target.
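The check-then-route flow described above can be sketched as follows. This is a platform-neutral illustration in Python, not the ML Kit API: the `FeatureStatus` enum here is a stand-in loosely modeled on the status check the guidance describes, and `summarize`, `on_device`, and `cloud` are hypothetical names.

```python
from enum import Enum, auto
from typing import Callable, Optional


class FeatureStatus(Enum):
    """Stand-in for the platform's availability check (assumed values)."""
    UNAVAILABLE = auto()
    DOWNLOADABLE = auto()
    DOWNLOADING = auto()
    AVAILABLE = auto()


Summarizer = Callable[[str], str]


def summarize(text: str,
              status: FeatureStatus,
              on_device: Summarizer,
              cloud: Optional[Summarizer] = None) -> Optional[str]:
    """Prefer the on-device path, fall back to cloud, else return None."""
    if status is FeatureStatus.AVAILABLE:
        try:
            return on_device(text)
        except RuntimeError:
            pass  # on-device inference failed; fall through to the fallback
    if cloud is not None:
        return cloud(text)
    return None  # no AI path: the caller hides or disables the feature


# Usage with placeholder implementations.
local = lambda t: f"[device] {t[:20]}"
remote = lambda t: f"[cloud] {t[:20]}"
print(summarize("Quarterly report text", FeatureStatus.AVAILABLE, local, remote))
print(summarize("Quarterly report text", FeatureStatus.UNAVAILABLE, local, remote))
```

Returning `None` rather than raising keeps the "no AI at all" case an ordinary outcome the UI can handle, which matters when it is the path most users will take.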
The situations where this pattern makes the most sense are features where privacy, latency, or offline AI access genuinely create a better experience.
We covered the broader case for building AI that works without internet in our post on what we learned building offline AI for mobile. The architecture patterns described there apply directly here. We've also written about why mobile AI matters for your product's competitive position if you're working through the strategic case.
The use cases to defer for now: anything requiring consistent behavior across all Android users, anything that needs up-to-date information, and anything positioned as a general-purpose AI assistant. The hardware gap makes those an engineering liability, not an advantage.
One practical step worth taking now, before Nano 4 ships broadly to consumers: set up a test device on the AICore Developer Preview (Pixel 10 or Galaxy S26), run your core AI prompts against both Nano 4 variants, and build the abstraction layer that lets you swap between on-device and cloud inference without changing feature code. That preparation costs little and will matter when the hardware base broadens over the next 12-18 months.
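One possible shape for that abstraction layer, together with a small harness for comparing backends on the same prompts, is sketched below in platform-neutral Python. `InferenceBackend`, `EchoBackend`, and `run_prompt_suite` are all hypothetical names. A real project would wrap the AICore and cloud clients behind the same interface, most likely in Kotlin.

```python
import time
from dataclasses import dataclass
from typing import Protocol


class InferenceBackend(Protocol):
    """The only surface feature code should depend on."""
    name: str

    def generate(self, prompt: str) -> str: ...


@dataclass
class EchoBackend:
    """Placeholder backend; a real one would call AICore or a cloud API."""
    name: str
    prefix: str

    def generate(self, prompt: str) -> str:
        return f"{self.prefix}{prompt}"


@dataclass(frozen=True)
class PromptResult:
    backend: str
    prompt: str
    output: str
    seconds: float


def run_prompt_suite(prompts: list[str],
                     backends: list[InferenceBackend]) -> list[PromptResult]:
    """Run every prompt against every backend and record wall-clock latency."""
    results: list[PromptResult] = []
    for backend in backends:
        for prompt in prompts:
            start = time.perf_counter()
            output = backend.generate(prompt)
            elapsed = time.perf_counter() - start
            results.append(PromptResult(backend.name, prompt, output, elapsed))
    return results


# Usage: compare two stand-in backends on the same prompt set.
suite = ["Summarize: ...", "Proofread: ..."]
backends = [EchoBackend("on-device", "D:"), EchoBackend("cloud", "C:")]
for result in run_prompt_suite(suite, backends):
    print(result.backend, repr(result.output), f"{result.seconds:.6f}s")
```

Because the harness only sees the `InferenceBackend` interface, adding the second Nano 4 variant or a different cloud provider means adding one more entry to `backends` and nothing else.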
For practical frameworks to think through AI feature decisions in your product, our free AI and machine learning resources are a good starting point.
Gemini Nano 4 is a technically sound step forward in edge AI for Android. The speed and efficiency gains are real. The multimodal capabilities are broader than the previous generation. The developer tooling through ML Kit and AICore is cleaner than it was a year ago.
The limitation isn't the model. It's the hardware it requires, and how narrow that makes the realistic audience in 2026. A Google flagship from last year doesn't qualify. A premium Samsung device from six months ago doesn't either. Our own CTO can't run the demo on his Pixel 9 Pro.
That's not a reason to ignore it. It's a reason to build toward it carefully, with fallback patterns in place from day one, rather than treating Nano 4 as a baseline assumption. The on-device AI direction is right, but it will not reach a meaningful share of Android users this year.
If you're building a mobile product and want to think through where on-device AI fits in your architecture, reach out to NineTwoThree. Our mobile team navigates these decisions regularly and can help you determine what to build now versus what to plan for.
