Which is the cheapest frontier-class model in June 2026?

Gemini 3.5 Flash is the cheapest of the three frontier-class fast models at $1.50 input / $9.00 output per million tokens. GPT-5.5-mini is roughly $1–3 input / $4–10 output. Claude Haiku 4.5 is roughly $1 input / $5 output. For pure cost per million tokens, GPT-5.5-mini and Claude Haiku 4.5 are slightly cheaper than Gemini 3.5 Flash on input, but Gemini 3.5 Flash has the highest reasoning quality of the three — which matters more for cost-per-task than cost-per-token.

What changed for cheap fast models in mid-2026?

The 'cheap tier' got dramatically smarter. Gemini 3.5 Flash launched May 19, 2026 at Google I/O scoring 76.2% on Terminal-Bench 2.1 — beating Gemini 3.1 Pro on coding and agentic benchmarks. Claude Haiku 4.5 (March 2026) matches Sonnet 4 on most reasoning tasks. GPT-5.5-mini matches GPT-5 on many tasks. The cheap tier now does work the flagship tier did 6–12 months ago, at 5–20x lower cost.

Should I use Gemini 3.5 Flash for production agents?

Yes, in many cases. Gemini 3.5 Flash hit 83.6% on MCP Atlas (agentic tool use), runs 4x faster than comparable frontier models, and is GA on Vertex AI and AI Studio. Use it for high-volume agent loops where you'd otherwise pay 5x more for Claude Opus 4.8. Stay on Opus 4.8 for autonomous coding agents and long autonomous runs where Dynamic Workflows (up to 1000 subagents) matter.

When should I pick Claude Haiku 4.5 over Gemini 3.5 Flash?

Pick Claude Haiku 4.5 when you're already in the Anthropic SDK / Claude Code ecosystem, when refusal calibration matters (regulated workloads, content moderation), or when you need consistent Anthropic-family behavior at lower cost. Pick Gemini 3.5 Flash when you need 1M-token context, multimodal vision, or the cheapest per-task cost on agentic workloads.

Quick Answer

Gemini 3.5 Flash vs Claude Haiku 4.5 vs GPT-5.5-mini June 2026

Published: June 9, 2026

Gemini 3.5 Flash vs Claude Haiku 4.5 vs GPT-5.5-mini: The Cheap-Fast Showdown

Three cheap-fast frontier-class models, three labs, very different strategies. In June 2026, the “Flash / Haiku / mini” tier matters more than the flagship tier for most production workloads because the cheap models now do work the flagship models did 6–12 months ago. Here’s how to pick.

Last verified: June 9, 2026

TL;DR

You’re building…	Pick
Cheapest agent at high volume	Gemini 3.5 Flash
Already on Anthropic SDK	Claude Haiku 4.5
Cheap ChatGPT-ecosystem chat / Codex calls	GPT-5.5-mini
Long-context summarization (>500k tokens)	Gemini 3.5 Flash
Tight refusal calibration / regulated	Claude Haiku 4.5
Multimodal vision at low cost	Gemini 3.5 Flash

Side-by-side

	Gemini 3.5 Flash	Claude Haiku 4.5	GPT-5.5-mini
Vendor	Google	Anthropic	OpenAI
Released	May 19, 2026	March 2026	Late 2025
Input price (per 1M tokens)	$1.50	~$1.00	~$1–3
Output price (per 1M tokens)	$9.00	~$5.00	~$4–10
Context window	1M tokens	200k tokens	256k tokens
Speed	4x frontier baseline	Fast	Fast
Coding (Terminal-Bench 2.1)	76.2%	~60%	~65%
Agent (MCP Atlas)	83.6%	~70%	~72%
Reasoning (CharXiv)	84.2%	~70%	~75%
Multimodal	Best (image, video, doc)	Image, doc	Image, voice
Available on	Vertex AI, AI Studio	Anthropic API, Bedrock	OpenAI API, Azure
Free tier	AI Studio free	Claude.ai free w/ limits	ChatGPT free w/ limits
`thinking_level` / extended thinking	Yes (new API in 3.5)	Yes	No (separate o-series for reasoning)

What changed with Gemini 3.5 Flash

Gemini 3.5 Flash is the headline shock. Launched May 19, 2026 at Google I/O and GA on Vertex AI on June 7, 2026. Per Google’s published numbers:

76.2% Terminal-Bench 2.1 — beats Gemini 3.1 Pro (the previous flagship)
83.6% MCP Atlas — leading agentic tool-use score
84.2% CharXiv Reasoning — top of class for the price
4x faster than comparable frontier models
$1.50/$9 per million tokens — 3x more than Gemini 3.0 Flash but 5–10x cheaper than Opus 4.8
1M token context — same as Gemini 3 Pro
New thinking_level API — toggle extended reasoning per request

Google’s positioning: Flash is now better at coding and agents than the previous flagship Pro. That’s the first time a “cheap tier” model has crossed that line.

What Claude Haiku 4.5 is good at

Claude Haiku 4.5 (March 2026) is Anthropic’s quiet workhorse:

Tightest refusal calibration — important for regulated workloads
Matches Sonnet 4 on most reasoning at 1/3 the price
First-class in Claude Code for high-volume routine tasks (Claude Code uses Haiku for non-coding subagent fan-out)
Excellent SDK ergonomics — same SDK as Opus 4.8, easy to swap
Available on Anthropic API + AWS Bedrock + Vertex AI (Anthropic is multi-cloud)

Weakness: 200k context (vs 1M for Gemini), weaker multimodal vision than Flash.

What GPT-5.5-mini is good at

GPT-5.5-mini is OpenAI’s cheap general-purpose model:

Tight integration with Codex CLI and ChatGPT extensions
Best Function Calling v2 at the cheap tier
Sora 2 + voice + image — all OpenAI multimodal endpoints work at mini tier
Massive distribution — ChatGPT has 800M+ weekly users
Good cost-per-task at chat workloads

Weakness: not GA on Azure for all regions yet (June 2026), and OpenAI’s pricing has been moving — check current rate card.

Cost-per-task math

For agent-style workloads, the right metric isn’t $/token, it’s $/task-completed. Rough numbers from production benchmarks:

Workload	Gemini 3.5 Flash	Claude Haiku 4.5	GPT-5.5-mini
MCP tool-use task	~$0.001	~$0.002	~$0.002
Code review 1k LOC	~$0.005	~$0.006	~$0.005
Summarize 100k-token PDF	~$0.15	~$0.20 (truncated)	~$0.25
Long autonomous agent run (1 hour)	~$2–5	~$3–7	~$3–6

Gemini 3.5 Flash wins on long-context summarization (because 1M context, no truncation). Claude Haiku and GPT-5.5-mini are similar on chat. Cost-per-task differences depend more on how often the model gets it right than on per-token price.

When to escalate to flagship

Use the cheap tier by default. Escalate to flagship when:

Multi-hour autonomous coding (Claude Opus 4.8 + Dynamic Workflows)
Frontier-grade reasoning over big codebases (Opus 4.8 or Gemini 3 Pro)
ChatGPT role plugins / Codex Sites (GPT-5.5)
Enterprise compliance requiring specific model tier

A common pattern: orchestrate with Opus 4.8 or GPT-5.5, fan out to Gemini 3.5 Flash / Haiku 4.5 / GPT-5.5-mini for subagents.

Picking your default

If your team…	Default model
Lives in Google Cloud	Gemini 3.5 Flash
Lives in AWS or already uses Anthropic SDK	Claude Haiku 4.5
Lives on OpenAI API + ChatGPT	GPT-5.5-mini
Doesn’t care about cloud	Gemini 3.5 Flash (cheapest per-task, best benchmarks at price)

The brutal truth in June 2026: Gemini 3.5 Flash is just better than the other two on the published benchmarks at a comparable price. If you’re greenfield and don’t have a vendor preference, start with Flash.

Sources

llm-stats.com: Gemini 3.5 Flash launch + benchmarks
Build Fast with AI: Gemini 3.5 Flash review (benchmarks, pricing)
digitalapplied.com: Gemini 3.5 Flash benchmarks, thinking, API
aitoolsrecap.com: Gemini 3.5 Flash review June 2026
codersera.com: Gemini 3.5 complete guide 2026
Anthropic: Haiku 4.5 model card
OpenAI: GPT-5.5-mini API documentation
nerdleveltech.com: Gemini 3.5 Flash benchmarks pricing