
For the past two years, the AI story in the Apple ecosystem has followed the same pattern — a staggered Apple Intelligence rollout, a delayed Siri rebuild, "personal context" landing only partially across releases, and the EU getting almost none of it. Heading into WWDC 2026, the question was simple: is this finally the year Apple delivers, or just another chapter of delay?
From my perspective as an iOS engineer at NineTwoThree, here's the honest answer — for the consumer-facing Siri AI experience, the wait continues. But for on-device developer AI, this is genuinely the year the platform landed, and in a way that changes the economics of mobile AI features we'd previously written off as too expensive to ship.
Apple opened the Foundation Models framework to third-party apps. Every Apple Intelligence-capable iPhone now ships with a 3-billion-parameter on-device model — AFM 3 Core — accessible through a native Swift API. It handles writing, summarization, translation across 25 languages, content classification, and tool calls, running at roughly 30 tokens per second on an iPhone 15 Pro. No cloud bill, no API key, no quota.
For heavier tasks, Private Cloud Compute (PCC) is now developer-accessible — a cryptographically attested environment where Apple can't read the request, verified independently by security researchers. And here's the part that changes the calculation for us: for developers in the App Store Small Business Program with fewer than 2 million lifetime downloads, PCC is free. The per-token cloud cost that has historically killed mobile AI features isn't really a blocker anymore for most teams.
There's also a new unifying Swift protocol — LanguageModel — that lets the same session and API run against Apple's on-device model, PCC, MLX, Anthropic Claude, or Google Gemini. Switching providers becomes a one-line change rather than a rewrite.
For about three years the industry consensus has been simple: real AI runs in the cloud, and mobile clients are thin wrappers. That held because per-token cloud inference was expensive, model files were too big to ship to a phone, and on-device hardware couldn't run a usable language model. WWDC 2026 broke all three assumptions at once.
The net effect: a lot of features we rejected as "too expensive to operate" two years ago are now zero-bill, on-device features.
The single biggest commercial shift is the cost model. Most of the stack is now free for most teams. On-device models (AFM 3 Core and Core Advanced) carry no metering at all. PCC is free under the Small Business Program below 2M lifetime downloads. Custom on-device models built with Core AI cost nothing to run — the user downloads them once via Background Assets. The only metered costs are deliberate choices to route to Claude or Gemini for capabilities Apple's models don't yet match.
That reframes the conversation we have with clients. Features that used to come with an open-ended per-image or per-token operating bill — receipt parsing, photo-to-structured-data, message rewriting, content tagging — are now fixed-cost or free, and they run on the phone.
There's a second story that matters even more for the healthcare, legal, and finance clients we work with: the privacy posture is different now. For years the blocking question on any regulated build has been "can we send this data to a cloud LLM?" — and the honest answer was usually no.
On this stack, that question finally has a defensible answer. On-device inference means the data never leaves the phone. PCC means that when you do need the cloud, Apple can cryptographically prove — to the client and to independent researchers — that no one, including Apple, can read the request. And Core AI means a client's proprietary model (a medical imaging classifier, a fraud-detection model, a domain-specific LLM they own) can ship to a phone without the model or its data ever touching a third-party vendor. That path simply didn't exist cleanly before.
A few patterns are immediately viable, and I'd reach for them in this order:
This isn't unqualified good news, and it's worth being straight about the limits.
Device eligibility is the big one. Apple Intelligence — the gate on Foundation Models access — runs on roughly 65–70% of US iPhones today (iPhone 15 Pro and newer). The higher-capability AFM 3 Core Advanced model, which adds image input, reaches only ~25–30%. That leaves roughly a third of users on devices that get nothing — and many of them are on recent hardware like the iPhone 14 or the base iPhone 15, which is still on sale. Any feature you build needs a graceful degradation path.
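One way to make that degradation path concrete is to pick the feature flow from the model tier the device reports. The sketch below is only an illustration: `ModelTier`, `ReceiptFlow`, and `receiptFlow(for:)` are hypothetical names of my own, not Apple API, and how the tier is detected at runtime is deliberately left out. The OCR-then-parse middle path is also my assumption about what a text-only device could do.

```swift
// Hypothetical capability tiers; the names are illustrative, not Apple API.
enum ModelTier {
    case coreAdvanced   // on-device model with image input
    case core           // text-only on-device model
    case unavailable    // device not eligible for Apple Intelligence
}

// The three ways a receipt-capture feature could behave.
enum ReceiptFlow: Equatable {
    case photoToStructuredData   // hand the photo straight to the model
    case ocrTextThenParse        // extract text first, then use the text model
    case manualEntry             // no model at all: a plain form
}

/// Maps a device's model tier to the richest flow it can support.
func receiptFlow(for tier: ModelTier) -> ReceiptFlow {
    switch tier {
    case .coreAdvanced: return .photoToStructuredData
    case .core:         return .ocrTextThenParse
    case .unavailable:  return .manualEntry
    }
}
```

The point of the exhaustive `switch` is that the unsupported case cannot be forgotten: adding a tier later forces a decision about its fallback at compile time.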
The free tier has a ceiling. Cross 2 million lifetime downloads and you exit the free PCC tier. Apple hasn't published pricing above that line yet, so treat it as TBD in any planning conversation.
The EU gets the frameworks but not the assistant. Foundation Models, PCC, Core AI, and the LanguageModel protocol all work in the EU. What's restricted is the consumer Siri AI assistant layer, which Apple attributes to DMA compliance. The developer-facing stack is not regionally gated.
If you were waiting for Siri to become the AI assistant Apple has been promising, this was another year on hold. But if you build apps, WWDC 2026 was the window opening. The platform now hands you free on-device inference, a privacy-grade cloud tier, and a clean path to ship custom models — all behind native Swift APIs. The features we used to price out of existence are suddenly shippable. That's the part worth acting on now.
This is exactly the kind of shift we help teams turn into shipped product. If you're weighing what's now possible on-device — especially in a regulated industry — talk to our team about your AI roadmap.
For the engineers: here's what each piece of the stack does, where the limits are, and how the pieces compose. Scope is intentionally narrow — the AI inference surfaces available to a third-party iOS developer, on-device or in PCC.
Apple Foundation Models 3 Core is the on-device model bundled with iOS 27, accessible through the FoundationModels Swift framework. It's a true small LLM, not a feature-specific classifier, and it runs entirely on the device.
@Generable macro and @Guide field hints.
.refusal and .guardrailViolation errors are part of the contract; English-first at launch, with the other 24 languages rolling in over the cycle.
The framework defaults to AFM 3 Core via SystemLanguageModel.default; swapping to PCC or a third-party provider is a one-line change to the model parameter.
The high-capability on-device variant is a 20-billion-parameter sparse Mixture-of-Experts model that activates just 1–4B parameters per token. Apple fits it on a phone by storing most of it in flash (NAND) rather than DRAM, using a lightweight dense block to select experts dynamically. Apple credits two techniques: Instruction-Following Pruning (IFP) for deployment beyond DRAM constraints, and Quantization Aware Training (QAT) to preserve accuracy at lower precision.
PCC is the cloud half of the stack — not a traditional cloud LLM, but a cryptographically attested compute environment where Apple proves no party (including Apple) can read the data. Announced for Apple's own use at WWDC 2024, it opened to third-party developers at WWDC 2026. It hosts three models: AFM 3 Cloud (the server-side workhorse), ADM 3 Cloud (Image) for image generation/editing, and AFM 3 Cloud Pro (the most capable, for agentic tool use and complex reasoning). Notably, AFM 3 Cloud Pro runs on Google Cloud infrastructure with NVIDIA GPUs — the first time PCC has extended beyond Apple silicon — with privacy guarantees maintained via the same attestation.
.light, .moderate, .deep), multi-step planning and agentic tool-calling, long-document summarization, multi-page OCR, and multi-image reasoning.
model.quotaUsage.status before invoking, disable AI-bound buttons near the limit, and surface an iCloud+ upsell where appropriate. Xcode 27 can simulate availability and quota states.
Core AI is a new framework (distinct from, and not replacing, the older Core ML) built for generative-era workloads: LLMs in the 10B+ range, image segmentation models, and custom domain models. Core ML stays for its legacy image-classifier and tabular cases.
The pipeline: convert a PyTorch model to Apple's .aimodel format via torch.export; compress with coreai-opt (int4 per-channel symmetric is the standard preset, with K-means palettization for embeddings and FP4/FP8 for sensitive layers); AOT-compile per target device with xcrun coreai-build; distribute via Background Assets, not the app bundle; and run through high-level wrappers like CoreAILanguageModel or CoreAIImageSegmenter.
The headline demo was SAM3: Meta's Segment Anything Model 3 ran on iPhone after compressing from ~3 GB to ~430 MB at int4 (about an 85% reduction) with no meaningful quality loss, plus a 76% speedup from cached image-encoder reuse via the multi-entrypoint asset feature.
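As a quick sanity check on that figure, the reduction is just one minus the ratio of the two sizes. The helper below is a throwaway of mine (the name `reduction(from:to:)` is not from any Apple tool), and it treats the rounded sizes quoted above as exact, which they are not.

```swift
/// Fractional size reduction going from `original` to `compressed`.
/// Both values must use the same unit.
func reduction(from original: Double, to compressed: Double) -> Double {
    precondition(original > 0, "original size must be positive")
    return 1.0 - compressed / original
}

// Sizes in MB, taking ~3 GB as 3,000 MB and ~430 MB as quoted.
let sam3Reduction = reduction(from: 3_000, to: 430)
print("SAM3 size reduction: \((sam3Reduction * 100).rounded())%")
```

With those rounded inputs the result lands in the 85 to 86 percent range, which matches the "about 85%" claim.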
API and lifecycle: three core types — AIModel, InferenceFunction, and NDArray — built on memory-safe, non-escapable Swift types. Specialization (per-device compile) is a formal lifecycle step that can be slow on first load, so trigger it ahead of time; AIModelCache manages cached artifacts and can share them across apps in the same App Group. For transformers, Core AI adds model states and in-place KV caching to avoid recomputing context. A dedicated Core AI Debugger visualizes execution, inspects tensors, and traces operations back to the original Python source.
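The "trigger it ahead of time" advice boils down to a compile-once cache that you warm before the user needs the model. The sketch below shows that pattern in plain Swift. It does not use Core AI: `SpecializedModel`, `SpecializationCache`, and the injected `compile` closure are hypothetical stand-ins for the real per-device compile step, and the class is not thread-safe as written.

```swift
/// Hypothetical stand-in for a compiled, device-specific model artifact.
struct SpecializedModel {
    let modelID: String
}

/// Caches the slow per-device compile so it runs at most once per model.
final class SpecializationCache {
    private var artifacts: [String: SpecializedModel] = [:]
    private(set) var compileCount = 0
    private let compile: (String) -> SpecializedModel

    init(compile: @escaping (String) -> SpecializedModel) {
        self.compile = compile
    }

    /// Call ahead of time, for example right after the asset download finishes.
    func prewarm(_ modelID: String) {
        _ = model(for: modelID)
    }

    /// Returns the cached artifact, compiling only on a cache miss.
    func model(for modelID: String) -> SpecializedModel {
        if let hit = artifacts[modelID] { return hit }
        compileCount += 1
        let fresh = compile(modelID)
        artifacts[modelID] = fresh
        return fresh
    }
}

// The closure stands in for the expensive specialization step.
let specializationCache = SpecializationCache { id in SpecializedModel(modelID: id) }
specializationCache.prewarm("sam3-int4")                     // pays the compile cost up front
let warmModel = specializationCache.model(for: "sam3-int4")  // served from the cache
```

In a shipping app the framework's own cache would presumably own the artifacts; the part worth copying is the call order, with the warm-up happening off the user's critical path.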
What ships today: Apple's open-source apple/coreai-models package includes ready pipelines for Qwen3, the Mistral family, and SAM3. These surface through CoreAILanguageModel as Foundation Models providers, so the same @Generable struct, streaming, and tool-call API work against a custom model you ship.
Limits: PyTorch source is required (no direct GGUF/weights-only conversion); torch.export is strict about dynamic shapes and control flow; custom CUDA kernels must be ported to Metal; there's roughly a 1 GB post-compression memory ceiling on iPhone (caps practical size around 10–30B at int4; Mac is much higher); no automatic .mlmodel → .aimodel converter; some open-weight licenses (notably Llama) prohibit App Store redistribution; and first-load specialization is slow — schedule it via Background Assets.
The LanguageModel protocol is the unifying abstraction. Anything that conforms — Apple's models, a custom Core AI model, Claude, or Gemini — drops into the same LanguageModelSession, and the downstream call sites stay identical. Five providers ship at WWDC 2026:
What it enables: provider-tier routing inside one app (cheap on-device for triage, PCC for harder tasks, Claude/Gemini for frontier needs), per-customer model selection in B2B without code changes, failover patterns, and future-proof shells where the "best model" is a runtime config.
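To show the shape of the failover pattern without guessing at Apple's signatures, here is a minimal sketch built on stand-in types. `TextProvider`, `StubProvider`, `ProviderError`, and `TieredRouter` are all hypothetical and are not the FoundationModels types. It also only covers failover in cost order; routing by task difficulty would need a classification step that is not shown.

```swift
// Hypothetical stand-ins; these are NOT the FoundationModels types.
enum ProviderError: Error {
    case unavailable
    case unsupportedRequest
}

protocol TextProvider {
    var name: String { get }
    func respond(to prompt: String) throws -> String
}

/// Test double that either echoes the prompt or throws a fixed error.
struct StubProvider: TextProvider {
    let name: String
    let failure: ProviderError?

    func respond(to prompt: String) throws -> String {
        if let failure = failure { throw failure }
        return "[\(name)] \(prompt)"
    }
}

/// Tries providers in order (cheapest first) and returns the first success.
struct TieredRouter {
    let tiers: [any TextProvider]

    func respond(to prompt: String) throws -> (provider: String, text: String) {
        var lastError: any Error = ProviderError.unavailable
        for provider in tiers {
            do {
                let text = try provider.respond(to: prompt)
                return (provider: provider.name, text: text)
            } catch {
                lastError = error   // remember why this tier failed, then try the next
            }
        }
        throw lastError
    }
}

let router = TieredRouter(tiers: [
    StubProvider(name: "on-device", failure: .unavailable),
    StubProvider(name: "pcc", failure: nil),
    StubProvider(name: "third-party", failure: nil),
])
let routed = try? router.respond(to: "Summarize this note")
```

Here the on-device stub fails, so the request falls through to the second tier and never reaches the metered third one. Because the tier list is plain data, per-customer provider selection is a matter of building a different array.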
Gotchas: provider parity isn't guaranteed (Claude can throw .unsupportedGenerationGuide for structured outputs Apple supports); API keys must use proxied auth in production; you own the third-party token bill; tool support varies (Claude exposes .webSearch/.webFetch/.codeExecution; Apple's on-device model has OCRTool, BarcodeReaderTool, SpotlightSearchTool); and OpenAI is an Xcode 27 agent option but not yet a first-party LanguageModel conformer.
Attachment(image) in the prompt builder (UIImage, CGImage, CVPixelBuffer, file URLs, and more), with OCRTool() and BarcodeReaderTool() built in, and can fill a @Generable struct directly from a photo. Image attachments cost tokens proportional to size, so downscale before sending. It returns text/structured output, not pixel masks (use Vision's segmentation requests for masks), and devices on plain AFM 3 Core fall back to text-only.
SpeechTranscriber, DictationTranscriber, SpeechDetector) now feeds transcripts straight into LanguageModelSession. The fully on-device shape (transcribe locally, then summarize/structure with AFM 3) means audio never leaves the device, which unlocks meeting recorders, voice journals, field-inspection reports, and medical/legal session notes. (For direct audio input you still route to a provider like Gemini.)
Keeping PCC free requires both conditions: enrollment in the App Store Small Business Program (under $1M/year) and fewer than 2 million lifetime first-time downloads across all apps. A single app crossing 2M downloads exits the free tier; Apple's stated intent is to publish above-threshold pricing before any developer is forced off.
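For planning conversations, the two free-tier conditions are easy to encode as a check you can run against your own numbers. This is a sketch under the thresholds stated above. `DeveloperAccount`, `qualifiesForFreePCC`, and `downloadsRemaining` are names I made up, and nothing here queries App Store Connect.

```swift
/// Hypothetical snapshot of the two facts the free tier depends on.
struct DeveloperAccount {
    var inSmallBusinessProgram: Bool
    var lifetimeFirstTimeDownloads: Int   // summed across all apps
}

/// Ceiling quoted in the article: fewer than 2 million lifetime downloads.
let freeTierDownloadCeiling = 2_000_000

/// Both conditions must hold; either one failing ends the free tier.
func qualifiesForFreePCC(_ account: DeveloperAccount) -> Bool {
    return account.inSmallBusinessProgram
        && account.lifetimeFirstTimeDownloads < freeTierDownloadCeiling
}

/// Headroom before the download ceiling, never negative.
func downloadsRemaining(_ account: DeveloperAccount) -> Int {
    return max(0, freeTierDownloadCeiling - account.lifetimeFirstTimeDownloads)
}

let studio = DeveloperAccount(inSmallBusinessProgram: true,
                              lifetimeFirstTimeDownloads: 1_750_000)
print(qualifiesForFreePCC(studio), downloadsRemaining(studio))
```

Tracking the headroom figure matters more than the boolean, since pricing above the line is still unpublished and you want warning before you cross it.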
The AI-capable base is the largest it has ever been at a feature launch — but ~30% on non-eligible devices is not an edge case, and much of it is recent hardware still in active sale. Build for graceful degradation.
It depends on what you're waiting for. The consumer-facing Siri AI assistant is still delayed, and in the EU it's restricted for DMA reasons. But for developers, WWDC 2026 delivered: a free on-device model on every capable iPhone, a privacy-grade cloud tier, a framework for shipping custom models on-device, and one Swift protocol that makes providers interchangeable.
For most teams, yes. On-device models (AFM 3 Core and Core Advanced) and custom Core AI models carry no metering. Private Cloud Compute is free for App Store Small Business Program developers under 2 million lifetime downloads. The only metered costs come from deliberately routing to Anthropic Claude or Google Gemini.
Foundation Models gives you Apple's built-in on-device models (and PCC access) through a Swift API. Private Cloud Compute is Apple's privacy-grade cloud tier for heavier tasks that need more context or chain-of-thought reasoning. Core AI is the framework for converting and running your own custom models — open-source or proprietary — directly on the device.
Yes. Core AI converts a PyTorch model to Apple's .aimodel format, compresses it (often ~85% smaller at int4), and ships it via Background Assets. The model and its data never leave the device — which is what makes it viable for regulated industries like healthcare, legal, and finance.
AFM 3 Core runs on iPhone 15 Pro and newer (~65–70% of US iPhones). The higher-capability AFM 3 Core Advanced, which adds image input, requires top-tier devices like the iPhone Air and iPhone 17 Pro (~25–30%). The iPhone 14 line and base iPhone 15 are not eligible, so plan a degradation path.
It makes the model a runtime choice instead of a compile-time dependency. The same LanguageModelSession and downstream code can target Apple's on-device model, PCC, a custom Core AI model, Claude, or Gemini — so switching providers, or supporting different providers per B2B customer, is a configuration change rather than a rewrite.
Apple-original sources
WWDC 2026 sessions cited: 241 (What's new in the Foundation Models framework), 319 (Build with the new Apple Foundation Model on PCC), 324 (Meet Core AI), 325 (Dive into Core AI model authoring and optimization), 326 (Integrate on-device AI models into your app using Core AI), 339 (Bring an LLM provider to the Foundation Models framework).
Third-party context: 9to5mac (on-device AI explainer), appcircle.io (Core AI framework explained), and the open-source apple/coreai-models recipes (Qwen3, Mistral, SAM3). Third-party LanguageModel conformers: ClaudeForFoundationModels (GitHub) and the Firebase Apple SDK for Gemini.
