
Quick Answer

What Is Gemini 3.5 Flash? Google's New Default AI Model (May 2026)

Gemini 3.5 Flash launched on May 19, 2026 at Google I/O and is now the default model for the Gemini app and AI Mode in Google Search. It’s Google’s first “frontier-class small model” — claiming to beat Gemini 3.1 Pro on most agentic benchmarks while running up to 4x faster and at a fraction of the cost.

Last verified: May 20, 2026

Quick facts

| Property | Value |
| --- | --- |
| Announced | May 19, 2026 (Google I/O 2026) |
| Vendor | Google DeepMind |
| Model family | Gemini 3.5 |
| Input context | 1,000,000 tokens |
| Output tokens | 64,000 |
| Speed | Up to 4x faster than other frontier models (output TPS) |
| SWE-bench Pro | 55.1% (single-attempt) |
| Terminal-Bench 2.1 | 76.2% |
| Default in | Gemini app, AI Mode in Search, Antigravity 2.0 |
| Powers | Gemini Spark, Information Agents in Search |
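To make the two token limits concrete, here is a minimal sketch of a pre-flight check that a request fits inside them. The limits come from the table above. The 4-characters-per-token ratio and the helper names are assumptions for illustration only; a real tokenizer will give different counts.

```python
# Limits taken from the quick-facts table.
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 64_000

# Assumption: roughly 4 characters per token for English text.
# Real tokenizers vary, so treat this as a coarse estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (heuristic, not a tokenizer)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_budget(prompt: str, requested_output_tokens: int) -> bool:
    """True if the estimated prompt size and the requested output both fit."""
    return (
        estimate_tokens(prompt) <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )
```

For exact numbers, swap `estimate_tokens` for the provider's own token counter.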

What’s actually new

Gemini 3.5 Flash isn’t a Pro-class model and Google isn’t pretending it is. The story is different: a small, fast, cheap model that hits frontier-class numbers on the workloads that matter for agents.

  • Agent-first — tuned for tool use, planning, and long-horizon multi-step tasks. It’s the model that powers Antigravity 2.0 by default and Gemini Spark.
  • 4x output speed — Google says output tokens per second is up to 4x faster than competing frontier models. For agentic loops that call the model repeatedly, this changes the unit economics.
  • 1M-token input window — same as Gemini 1.5 Pro, but in a Flash-tier model.
  • Beats 3.1 Pro on most evals — this is Google’s main marketing claim: a Flash model that outperforms the previous-generation Pro on agentic and coding tasks.
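The unit-economics point in the speed bullet can be sketched with simple arithmetic. The baseline throughput, step count, and tokens per step below are hypothetical placeholders, not published figures; only the "up to 4x" multiplier comes from Google's claim, and it is an upper bound.

```python
def loop_generation_seconds(
    steps: int, output_tokens_per_step: int, tokens_per_second: float
) -> float:
    """Time spent generating output tokens across an agent loop.

    Ignores tool calls, network latency, and prompt processing,
    so real wall-clock time will be higher.
    """
    return steps * output_tokens_per_step / tokens_per_second


# Hypothetical workload and baseline throughput (assumptions, not measurements).
STEPS = 1_000
OUTPUT_TOKENS_PER_STEP = 500
BASELINE_TPS = 50.0
CLAIMED_SPEEDUP = 4.0  # "up to 4x", so treat this as a best case

baseline = loop_generation_seconds(STEPS, OUTPUT_TOKENS_PER_STEP, BASELINE_TPS)
best_case = loop_generation_seconds(
    STEPS, OUTPUT_TOKENS_PER_STEP, BASELINE_TPS * CLAIMED_SPEEDUP
)

print(f"baseline:  {baseline:.0f} s")
print(f"best case: {best_case:.0f} s")
```

Under these assumed numbers, generation time falls from 10,000 seconds to 2,500 seconds. The saving shrinks in practice if tool execution, rather than token generation, dominates each step.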

Benchmarks

| Benchmark | Gemini 3.5 Flash | Gemini 3.1 Pro | GPT-5.5 | Claude Opus 4.7 |
| --- | --- | --- | --- | --- |
| SWE-bench Pro (single attempt) | 55.1% | ~50% | 58.6% | 64.3% |
| Terminal-Bench 2.1 | 76.2% | 70.3% | 78.2% | 66.1% |
| Output tokens per second | ~4x baseline | baseline | baseline | baseline |
| Context window (input) | 1M | 1M | ~400K | 200K |

Read this honestly: Opus 4.7 still leads on coding (64.3% on SWE-bench Pro), and GPT-5.5 leads on terminal-style agentic tasks (78.2% on Terminal-Bench 2.1). Gemini 3.5 Flash trails both leaders but beats 3.1 Pro on both benchmarks. Its superpower is speed + price + context length: it’s the model you want in the loop when you’re running thousands of agent steps.
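The reading above can be checked directly against the benchmark table. This is a small sketch that finds the leader on each benchmark and Gemini 3.5 Flash's gap to it; the scores are copied from the table, and the helper names are illustrative.

```python
# Scores (percent) copied from the benchmark table.
# The Gemini 3.1 Pro SWE-bench Pro figure is approximate ("~50%") in the source.
SCORES = {
    "SWE-bench Pro": {
        "Gemini 3.5 Flash": 55.1,
        "Gemini 3.1 Pro": 50.0,
        "GPT-5.5": 58.6,
        "Claude Opus 4.7": 64.3,
    },
    "Terminal-Bench 2.1": {
        "Gemini 3.5 Flash": 76.2,
        "Gemini 3.1 Pro": 70.3,
        "GPT-5.5": 78.2,
        "Claude Opus 4.7": 66.1,
    },
}


def leader(benchmark: str) -> tuple[str, float]:
    """Return (model, score) for the top score on a benchmark."""
    results = SCORES[benchmark]
    best = max(results, key=results.get)
    return best, results[best]


def gap_to_leader(benchmark: str, model: str) -> float:
    """Percentage-point gap between a model and the benchmark leader."""
    _, top = leader(benchmark)
    return round(top - SCORES[benchmark][model], 1)


for name in SCORES:
    top_model, top_score = leader(name)
    gap = gap_to_leader(name, "Gemini 3.5 Flash")
    print(f"{name}: {top_model} leads at {top_score}%, Flash trails by {gap} pts")
```

On these figures Flash trails the leader by 9.2 points on SWE-bench Pro and by 2.0 points on Terminal-Bench 2.1.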

Where Gemini 3.5 Flash shows up

| Surface | What it does |
| --- | --- |
| Gemini app | New default model (replaces 3.1 Pro for free + Pro tiers) |
| AI Mode in Google Search | Powers the new conversational search box + Information Agents |
| Antigravity 2.0 | Default model in Google’s agent-first IDE |
| Gemini Spark | The reasoning engine for Google’s 24/7 personal AI agent |
| Gemini API / Vertex AI | Generally available for developers |
| Jules | Backs the free tier (3 concurrent / 15 daily tasks) |

Pricing

Gemini 3.5 Flash is cheaper than 3.1 Pro, and Google’s stated pitch is “frontier-class quality at Flash prices.” Exact per-token API prices are listed at ai.google.dev. For most Pro-tier users, the practical impact is higher rate limits and faster responses at the same Gemini app subscription price.

The new Google AI subscription stack as of May 20, 2026:

| Plan | Price | What you get |
| --- | --- | --- |
| Free | $0 | Limited daily Gemini 3.5 Flash usage |
| Google AI Pro | $19.99/mo | Higher 3.5 Flash limits, Veo 3.1, NotebookLM Pro |
| Google AI Ultra | $99.99/mo | 5x usage of Pro, priority Antigravity, early Gemini Spark, 20TB cloud, YouTube Premium |
| Google AI Ultra (top tier) | $199.99/mo | 20x usage of Pro, expanded Project Genie access |

The top Ultra tier dropped from $250/mo to $199.99/mo. The $99.99/mo Ultra tier is new.
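One way to compare the paid plans is price per unit of Pro-equivalent usage. This is a rough sketch using only the prices and usage multipliers from the table; it deliberately ignores bundled extras such as cloud storage and YouTube Premium, and the "usage unit" is an illustrative construct, not a Google metric.

```python
# (plan, monthly price in USD, usage multiplier relative to Google AI Pro)
# Prices and multipliers are taken from the subscription table.
PLANS = [
    ("Google AI Pro", 19.99, 1),
    ("Google AI Ultra", 99.99, 5),
    ("Google AI Ultra (top tier)", 199.99, 20),
]


def price_per_usage_unit(price: float, multiplier: int) -> float:
    """Monthly price divided by the Pro-relative usage multiplier."""
    return round(price / multiplier, 2)


for plan, price, multiplier in PLANS:
    unit = price_per_usage_unit(price, multiplier)
    print(f"{plan}: ${unit:.2f} per Pro-equivalent usage unit")
```

On usage alone, Pro and the $99.99 Ultra tier cost about the same per unit (roughly $20), while the top tier comes out at about half that (roughly $10).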

Who should care

Agent developers: Flash’s up-to-4x output speed and 1M-token context change agentic loop economics. If you’re hitting Opus or GPT-5.5 limits on cost or latency, Flash is the obvious comparison.

Anyone using AI Mode in Google Search: you’re already using 3.5 Flash, even if you didn’t notice. The new search box, Information Agents, and conversational refinement all run on it.

Cursor / Windsurf / Antigravity users: Flash is now the default in Antigravity and is the cheaper option in the Cursor 3 and Windsurf model pickers.

Jules users: the free tier moves to 3.5 Flash, while the Pro and Ultra tiers still use 3.1 Pro pending the 3.5 Pro launch in June.

What’s next

  • Gemini 3.5 Pro — June 2026 launch, currently in testing.
  • Gemini Omni Flash — rolling out alongside 3.5 Flash for multimodal video generation.
  • Gemini Spark — beta access for Ultra subscribers in the US next week.

TL;DR

Gemini 3.5 Flash is the new floor for Google’s models, not the ceiling. It’s a Flash-tier model that beats last-gen Pro on most agentic benchmarks, runs up to 4x faster, and is now the default in the Gemini app, AI Mode in Search, Antigravity 2.0, and the Jules free tier. The actual flagship (Gemini 3.5 Pro) is due in June 2026. For now, Flash is the workhorse and it’s a strong one.