What's the difference between Apple Core AI, Foundation Models, and MLX?

Three layers of the iOS 27 on-device AI stack, with different jobs. Foundation Models is the sealed system LLM Apple ships with the OS — you call into it, you don't choose the model, and the API is high-level (generate text, structured output, tool use). Core AI is the lower-level inference framework introduced at WWDC 2026 — you bring your own model file, specialize it for the target hardware, and run inference with explicit control over CPU/GPU/Neural Engine scheduling. MLX is Apple's open-source array framework for training and research, the layer underneath both, where you build and train custom models on Apple silicon. As a rule: Foundation Models for default AI features in your app, Core AI when you need to ship a specific third-party or custom model on-device, MLX when you're training.

Should I use Apple Foundation Models or call out to Claude / ChatGPT instead?

Use Foundation Models when your app needs simple on-device intelligence (text summarization, structured output extraction, writing tools) without the latency or privacy concerns of a server round-trip. It's free, runs offline, and ships with the OS. Use cloud Claude / ChatGPT / Gemini when you need frontier capability that your users will pay for (Foundation Models is intentionally a small on-device model and cannot match Fable 5 or GPT-5.5 on hard reasoning), or when your app already integrates with a cloud LLM provider for cross-platform parity. A hybrid is common in production iOS apps in 2026: Foundation Models for fast on-device tasks, a cloud LLM for hard tasks gated behind a paywall.
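
The hybrid split can be expressed as a small routing rule. The sketch below is plain Python for illustration only. The task labels, the Request fields, and the choose_backend function are hypothetical and do not come from any Apple or cloud-provider API.

```python
from dataclasses import dataclass

# Hypothetical task labels that a small on-device model is assumed to handle well.
ON_DEVICE_TASKS = {"summarize", "extract_fields", "rewrite"}


@dataclass(frozen=True)
class Request:
    task: str            # e.g. "summarize" or "multi_step_reasoning"
    is_subscriber: bool  # whether the user has paid for cloud features
    online: bool         # whether a network connection is available


def choose_backend(req: Request) -> str:
    """Return "on_device", "cloud", or "unavailable" for a request."""
    if req.task in ON_DEVICE_TASKS:
        return "on_device"   # simple tasks stay local: free, private, offline
    if req.is_subscriber and req.online:
        return "cloud"       # hard tasks go to the paid cloud model
    return "unavailable"     # hard task without entitlement or network


print(choose_backend(Request("summarize", is_subscriber=False, online=False)))
print(choose_backend(Request("multi_step_reasoning", is_subscriber=True, online=True)))
```

Keeping the rule in one function makes the paywall and offline behavior easy to test separately from either model call.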

Did Apple actually replace Core ML with Core AI?

No. Core AI was added alongside Core ML, not in place of it. Core ML is still present in iOS 27 for traditional ML model deployment (vision, audio, time-series, and classical pipelines converted from frameworks like PyTorch or TensorFlow). Core AI is a new framework that sits below Foundation Models and Core ML for cases where developers want explicit control over LLM-style model execution on Apple silicon. The distinction Apple draws: Core ML is the converter plus runtime for fixed models, with the converter making hardware decisions; Core AI gives you the model-execution surface where you drive specialization, caching, and inference scheduling yourself. The two coexist in iOS 27, with Apple positioning Core AI as the right choice for new LLM-style integrations and Core ML for existing converted-model workflows.

Can I run Llama 5 or Qwen 3.6 on iPhone using Core AI?

Technically yes for smaller models; in practice you are constrained by device RAM. Core AI gives you the Swift API to load model assets (.aiasset format), specialize them for Apple silicon, and run inference across the CPU, GPU, and Neural Engine. iPhone 17 Pro ships with 12 GB of unified memory; iPad Pro M5 with 16 GB. That fits 7B-parameter models comfortably at 4-bit quantization (~4 GB), fits 13B models with tight headroom, and rules out 70B+ models. A practical recipe in mid-2026: take a 3B–8B parameter open model (Llama 5 8B, Qwen 3.6 7B, Gemma 4 4B), quantize it to 4-bit via MLX, package it as an .aiasset, and either ship it with your app or download it on first run. Inference at conversational latency on the Neural Engine is achievable. The MLX community has working examples for most current open-weight models.
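
The "~4 GB for 7B at 4-bit" figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes a group-wise scheme in which every 64 weights share one 16-bit scale and one 16-bit offset, which adds 0.5 bits per weight. That layout is an assumption for illustration and is not a documented property of the .aiasset format. It counts weights only and leaves out the KV cache, activations, and the rest of the app.

```python
def quantized_weight_gb(params_billion: float, bits: int = 4,
                        group_size: int = 64,
                        overhead_bits_per_group: int = 32) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    # Each group stores `group_size` codes plus shared scale/offset metadata.
    effective_bits = bits + overhead_bits_per_group / group_size
    total_bytes = params_billion * 1e9 * effective_bits / 8
    return total_bytes / 1e9


for size in (3, 7, 13, 70):
    print(f"{size}B at 4-bit: ~{quantized_weight_gb(size):.1f} GB of weights")
```

Under these assumptions a 7B model needs about 3.9 GB and a 13B model about 7.3 GB, while a 70B model needs roughly 39 GB, well beyond a 12 GB or 16 GB device.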

Apple Core AI vs Foundation Models vs MLX: Which iOS 27 AI Framework (June 2026)

Published: June 15, 2026

Apple’s WWDC 2026 reorganized the on-device AI stack into three distinct layers: Foundation Models for system-LLM features, Core AI for custom model execution, and MLX for research and training. This page explains what each does, when to use which, and the practical limits in iOS 27.

Last verified: June 15, 2026, based on WWDC 2026 sessions (June 8–12) and developer documentation released alongside iOS 27 developer beta 1.

TL;DR

Foundation Models — Apple’s sealed system LLM. Free, ships with iOS 27, high-level API. Use for default on-device intelligence in apps.
Core AI — Low-level inference framework. Bring your own model. Use when you ship a specific third-party or custom model on iPhone/iPad/Mac.
MLX — Open-source array framework. Use for training, research, and quantizing models before deploying via Core AI.
Core ML — Still around for traditional ML conversion workflows.

The new stack at a glance

Layer	Surface	Model	Who decides quantization	Use case
Foundation Models	High-level Swift API	Apple’s on-device LLM (sealed)	Apple	Default LLM features in any app
Core AI	Mid-level Swift API	You bring your own (.aiasset)	You, via SpecializationOptions	Ship custom or third-party LLM on-device
Core ML	High-level Swift API	You bring your own (converted .mlmodel)	Core ML converter	Traditional ML — vision, audio, classification
MLX	Python + Swift framework	Anything you build	You	Training, research, model conversion

When to use Foundation Models

Foundation Models is the easiest path and the default Apple wants most apps to take. iOS 27 gives every app access to:

Text generation — single-shot completion, streaming responses, conversational turn-taking
Structured output — Swift type-safe outputs via the @Generable macro
Tool use — your app declares Swift functions; the model calls them
Image input — Foundation Models can now accept images alongside text (new in iOS 27)
Custom skills — fine-tune behavior with developer-supplied skill definitions
Server option — same API surface can run on Apple’s Private Cloud Compute servers when the on-device model is insufficient

The big trade-off: you don’t pick the model. Apple ships one on-device LLM and one server LLM, and you call them. Both are intentionally small-to-medium models tuned for Apple’s privacy and battery constraints; they do not replace GPT-5.5, Fable 5, or Gemini 3.5 Pro.

Use Foundation Models when:

You want “smart text” features without managing model files
Latency, privacy, and offline operation matter more than peak quality
Your app should “just work” on every iOS 27 device

When to use Core AI

Core AI is the right tool when you need a specific model on-device that Foundation Models doesn’t ship. Apple’s framing: Core AI is the same inference framework that runs Apple Intelligence, now opened up for your app.

The Swift API revolves around five concepts:

AIModelAsset — unspecialized model file; inspect structure and metadata cheaply
AIModel — specialized model for a specific device; runs inference
AIModelCache — stores device-specific artifacts so you don’t re-specialize on every launch
InferenceFunction — owns the weights and buffers; Sendable so you can run concurrently
ComputeUnitKind / SpecializationOptions — target CPU, GPU, or Neural Engine

The workflow: load an .aiasset model file → specialize it for the user’s device (one-time, cached) → instantiate an InferenceFunction → run inference with NDArray inputs and outputs.
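
The load, specialize, cache, run sequence can be modeled in a few lines. The sketch below is plain Python, not Swift, and does not use the Core AI API. SpecializationCache, the device strings, and the fake asset bytes are hypothetical stand-ins, chosen only to show why a cache keyed on asset contents and device avoids repeating the expensive step.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Every name here is hypothetical; this models the workflow, not the Core AI API.
SpecializedModel = Callable[[List[float]], List[float]]


class SpecializationCache:
    """Keeps one specialized artifact per (asset contents, device) pair."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], SpecializedModel] = {}
        self.specialize_calls = 0  # counts the expensive step, for illustration

    def get(self, asset_bytes: bytes, device: str) -> SpecializedModel:
        key = (hashlib.sha256(asset_bytes).hexdigest(), device)
        if key not in self._store:
            self._store[key] = self._specialize(asset_bytes, device)
        return self._store[key]

    def _specialize(self, asset_bytes: bytes, device: str) -> SpecializedModel:
        self.specialize_calls += 1
        scale = float(len(asset_bytes))  # stand-in for real compiled weights
        return lambda inputs: [x * scale for x in inputs]


cache = SpecializationCache()
asset = b"fake-model-bytes"                   # step 1: load the asset
model = cache.get(asset, "iphone-gpu")        # step 2: specialize on first use
model_again = cache.get(asset, "iphone-gpu")  # later request: cache hit
outputs = model([1.0, 2.0])                   # steps 3-4: run inference
print(cache.specialize_calls, outputs)
```

This cache lives in memory, so it only illustrates the keying; a real cache would persist artifacts to disk so they survive app launches.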

Use Core AI when:

You’re shipping a specific model — your own, or a third-party open model
You need control over which compute unit runs inference
Foundation Models is too generic or too small for your workload

Practical examples mid-2026:

A photography app shipping a domain-tuned image-captioning model
A medical app running a specialized clinical-language model with HIPAA-relevant restrictions
A coding assistant shipping a code-specific 7B model for offline use

When to use MLX

MLX is the layer underneath, used by developers and researchers who train models or convert open-weight models for Apple silicon deployment. MLX gives you:

NumPy-like array operations on Apple silicon
Automatic differentiation
Unified memory model that avoids host↔device copies
Quantization to 4-bit and 8-bit for deployment
Direct export paths to .aiasset for Core AI deployment

You don’t ship MLX in a consumer app — you use MLX in your build pipeline to produce the model file that Core AI loads. The community ecosystem around MLX in mid-2026 has working ports of most current open-weight models (Llama 5, Qwen 3.6, Gemma 4, Mistral, DeepSeek V4 Flash, Phi family).
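
To make the quantization step concrete, here is a minimal pure-Python sketch of group-wise affine quantization, the general technique behind 4-bit weight formats. It is not MLX's implementation, and MLX's own routines may differ in details; the function names and sample weights are illustrative assumptions.

```python
from typing import List, Tuple


def quantize_group(weights: List[float], bits: int = 4) -> Tuple[List[int], float, float]:
    """Map one group of floats to integer codes in [0, 2**bits - 1].

    Returns (codes, scale, bias) such that weight ~= code * scale + bias.
    """
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    # A constant group has no range; any non-zero scale works because all codes are 0.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, bias: float) -> List[float]:
    """Recover approximate floats from integer codes."""
    return [c * scale + bias for c in codes]


group = [-0.31, -0.12, 0.0, 0.07, 0.2, 0.44]
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
max_error = max(abs(a - b) for a, b in zip(group, restored))
print(codes, round(scale, 4), round(max_error, 4))
```

Each weight is stored as a 4-bit code plus a scale and bias shared by the group, and the rounding error per weight is at most half of one scale step.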

Use MLX when:

You’re training or fine-tuning a custom model
You need to convert an open-weight model to Apple silicon
You’re doing research on Apple hardware

Where Core ML still fits

Core ML hasn’t gone anywhere. iOS 27 keeps full support for models converted with coremltools, in both the .mlmodel and .mlpackage formats. The split:

Core ML — traditional ML model deployment, especially models converted from PyTorch / TensorFlow via coremltools. Fixed model, converter picks hardware. Best for vision classifiers, audio models, time-series, classical ML pipelines.
Core AI — LLM-style model execution where you want explicit control. Best for new generative AI integrations.

Existing Core ML deployments stay valid. Apple has not announced a deprecation timeline.

Practical sizing on current devices

Mid-2026 hardware shipping with iOS 27:

Device	Unified RAM	Comfortable on-device model size
iPhone 17 Pro	12 GB	7B-13B at 4-bit quantization
iPhone 17 (non-Pro)	8 GB	3B-7B at 4-bit
iPhone 16 series (most)	8 GB	3B-7B at 4-bit
iPad Pro M5	16 GB	13B-30B at 4-bit
MacBook Pro M5 Max	64-128 GB	70B+ at 4-bit

Apple Intelligence-eligible devices (iPhone 15 Pro and later, plus Apple silicon iPads and Macs) get Foundation Models. Core AI runs on the same set of devices.

Decision flow

Question 1: Does Foundation Models do what you need?
  Yes → Foundation Models. Done.
  No  → Continue.

Question 2: Do you need to ship a specific model?
  Yes → Core AI (production), MLX (build pipeline).
  No  → Continue.

Question 3: Is your model a converted traditional-ML pipeline?
  Yes → Core ML.
  No  → Core AI is probably the right answer.

Question 4 (independent of the answers above): Are you training or doing research?
  Yes → MLX, alongside whichever runtime you ship with.
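
The flow above can also be read as a small function. This is one interpretation rather than an official rule: it treats the training question as independent of the three deployment questions, and the function and parameter names are hypothetical.

```python
from typing import List


def recommend(foundation_models_suffices: bool,
              need_specific_model: bool,
              converted_traditional_ml: bool,
              training_or_research: bool) -> List[str]:
    """Return the frameworks suggested by the four questions."""
    picks: List[str] = []
    if foundation_models_suffices:          # Question 1
        picks.append("Foundation Models")
    elif need_specific_model:               # Question 2
        picks += ["Core AI", "MLX"]         # Core AI in production, MLX in the build pipeline
    elif converted_traditional_ml:          # Question 3
        picks.append("Core ML")
    else:
        picks.append("Core AI")
    if training_or_research and "MLX" not in picks:  # Question 4
        picks.append("MLX")
    return picks


print(recommend(True, False, False, False))
print(recommend(False, True, False, False))
print(recommend(False, False, True, False))
```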

What competitors offer

For context — what other platforms ship in June 2026:

Google Android 16: ML Kit + Gemini Nano on-device. Comparable to Foundation Models, with less developer flexibility for custom models.
Windows 11 25H2 / Copilot+ PCs: Windows Agent Runtime + DirectML for custom model deployment. Comparable to Core AI.
Cross-platform: ONNX Runtime, llama.cpp, and MLC LLM all work on Apple silicon but largely bypass Apple’s Neural Engine (llama.cpp and MLC LLM run on the GPU through Metal); Core AI is the right choice when you want NPU acceleration.

The big picture

The iOS 27 split — Foundation Models for default, Core AI for custom, MLX for training — is Apple’s bet that most developers want simplicity (Foundation Models) but a meaningful minority will ship custom models (Core AI). Mid-2026, both adoption paths look viable. If you’re starting an iOS app today and want any AI feature, default to Foundation Models. Reach for Core AI when you’ve hit a wall.

iOS 27 is in developer beta as of WWDC 2026. APIs may change before fall public release.
