GPT-5.5 Pricing Shock: What To Do (April 2026 Action Guide)
OpenAI raised GPT-5.5 to $5/$30 per million tokens on April 23. DeepSeek V4-Pro launched the next day at $1.74/$3.48. Most teams that didn’t already have a multi-model strategy now urgently need one. Here’s the playbook.
Last verified: April 28, 2026
What actually happened
- April 23: OpenAI announces GPT-5.5. Pricing: $5.00/M input, $30.00/M output, $0.50/M cached input.
- April 24: DeepSeek V4 launches. V4-Pro pricing: $1.74/M input, $3.48/M output, ~$0.0036/M cached input.
- April 25-28: Anthropic, Google, and OpenRouter all see usage shifts. Cursor, Windsurf, and OpenCode race to integrate V4.
The price gap on output tokens is ~8.6x. On cached input it’s ~140x. This is not a small adjustment — it’s a structural break in the API economy.
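A quick sanity check on those ratios, using the list prices above:

```python
# Published per-million-token prices from the two announcements.
gpt55_output, v4pro_output = 30.00, 3.48
gpt55_cached, v4pro_cached = 0.50, 0.0036

output_gap = gpt55_output / v4pro_output  # ≈ 8.6x
cached_gap = gpt55_cached / v4pro_cached  # ≈ 139x, i.e. the "~140x" quoted

print(f"output gap: {output_gap:.1f}x, cached-input gap: {cached_gap:.0f}x")
```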
Why OpenAI raised prices
Three plausible reasons (none confirmed):
- Inference cost. GPT-5.5 is a bigger model with longer reasoning chains. Per-token compute genuinely costs more.
- Margin defense. Enterprise contracts have switching cost. OpenAI is choosing to extract more from sticky customers while ceding price-sensitive devs.
- Anchoring for GPT-5.5-mini. A $30/M ceiling makes an $8/M GPT-5.5-mini look cheap. Classic Apple-style price laddering.
Whatever the reason, the practical answer is the same: route around it.
The 30-day migration plan
Week 1: Audit
- Pull your last 30 days of OpenAI usage. Group by endpoint and prompt template.
- Categorize each call:
  - Bounded (chat, summarization, RAG, single-step coding) → migration candidate
  - Long autonomous (agent runs >2 hr, Computer Use, Realtime) → keep on GPT-5.5
  - Multimodal (vision, image gen) → consider Gemini 3.1 Pro or stay
- Identify your top-3 spend categories. Migration ROI concentrates there.
Week 2: Build the router
Pick one of:
- OpenRouter — easiest, unified billing, but adds ~10% markup.
- LiteLLM — self-hosted router, OpenAI-compatible, free.
- Portkey — managed router, observability built in.
- DIY — a simple if/else in your code, pointing at api.deepseek.com for V4 traffic.
Add eval coverage:
- Promptfoo — parallel A/B test prompts across models with same eval set.
- Inspect (UK AISI) — for safety-leaning evals.
- Phoenix (Arize) or Langfuse — for production observability.
Week 3: 10/90 split
Route 10% of bounded traffic to V4-Pro. Watch:
- LLM-judge accuracy on a static eval set.
- User-facing metrics (CSAT, task completion).
- Latency (V4-Pro ~10% slower than GPT-5.5 on first token, but cheaper input cache helps).
- Cost.
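One way to implement the 10% split is a deterministic hash bucket, so a given user or session always lands on the same model and doesn't flip-flop mid-conversation. A sketch, with the routing key and model names as placeholders:

```python
import hashlib

CANARY_FRACTION = 0.10  # the week-3 split

def assign_model(request_key: str) -> str:
    """Deterministic canary assignment: hash the key (e.g. user ID)
    into a uniform [0, 1) bucket; the same key always gets the same model."""
    digest = hashlib.sha256(request_key.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "v4-pro" if bucket < CANARY_FRACTION else "gpt-5.5"
```

Bumping `CANARY_FRACTION` to 0.5 in week 4 reuses the same mechanism for the 50/50 split.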
Week 4: 50/50 or full cutover
If quality holds (it usually does for bounded tasks):
- Move bounded traffic to 100% V4-Pro.
- Keep long autonomous and Computer Use on GPT-5.5.
- Reserve Opus 4.7 for hardest tasks where quality matters more than cost.
Realistic outcome: 50-70% reduction in API spend, with no user-perceptible quality drop on bounded workloads.
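A rough illustration of where that range comes from (the blend numbers are assumptions, not measurements): if 70% of your spend is on bounded workloads and migrating those saves 80%, total spend drops by about 56%.

```python
# Illustrative blend: share of spend that is bounded (the audit step
# typically finds 60-80%), and the savings when that share moves to V4-Pro.
bounded_share = 0.70
bounded_savings = 0.80

total_savings = bounded_share * bounded_savings
print(f"total spend reduction: {total_savings:.0%}")
```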
Concrete migrations by workload type
| Workload | Was | Move to | Expected savings |
|---|---|---|---|
| Customer support RAG | GPT-5.5 | V4-Pro | 80% |
| Code review bot | GPT-5.5 | V4-Pro or Sonnet 4.6 | 75% |
| Document summarization | GPT-5.5 | V4-Flash | 95% |
| Bulk classification | GPT-5.5 | V4-Flash | 95% |
| Chatbot | GPT-5.5 | V4-Pro | 80% |
| Long autonomous agent | GPT-5.5 | Stay (or Opus 4.7) | 0% |
| Computer Use | GPT-5.5 | Stay | 0% |
| Realtime voice | GPT-5.5 Realtime | Stay | 0% |
| Vision-heavy multimodal | GPT-5.5 | Gemini 3.1 Pro | 60% |
| Hardest reasoning | GPT-5.5 xhigh | Opus 4.7 | varies |
Caching: the secret weapon
DeepSeek V4-Pro caches input tokens at roughly $0.0036 per million — basically free. GPT-5.5 caches at $0.50/M. That’s a 140x gap.
If your agent has a large stable system prompt (think 30K-token tool definitions), V4-Pro effectively zeros out your input cost on cache hits. This alone can flip the economics for high-volume agent workloads.
To get the cache hits:
- Keep system prompt prefix stable across calls.
- Don’t insert variable content (timestamps, request IDs) at the top.
- Put dynamic content at the end.
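The three rules amount to: build every request as one byte-identical prefix plus a dynamic tail. A sketch, where the system prompt is a stand-in for your real tool definitions:

```python
from datetime import datetime, timezone

# Stable prefix: identical bytes on every call, so the provider's
# prefix cache can hit. In practice this is your 30K-token tool block.
SYSTEM_PROMPT = "You are a support agent.\n<...stable tool definitions...>"

def build_messages(user_query: str) -> list[dict]:
    """Dynamic content (timestamps, request IDs) goes at the END,
    after the cacheable prefix — never interleaved at the top."""
    now = datetime.now(timezone.utc).isoformat()
    return [
        {"role": "system", "content": SYSTEM_PROMPT},          # cached prefix
        {"role": "user", "content": f"[{now}] {user_query}"},  # dynamic tail
    ]
```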
What about Anthropic?
Claude Opus 4.7 is at $5/$25 — cheaper than GPT-5.5 on output. Sonnet 4.6 is at $3/$15. For Anthropic-committed teams, the migration path is “stay on Anthropic, use Sonnet 4.6 by default, escalate to Opus 4.7.”
Anthropic also has Claude Code’s flat-rate pricing ($200/mo Max tier), which is the right answer for heavy individual developer use cases — uncapped Sonnet 4.6 + budgeted Opus 4.7.
What if I’m locked into OpenAI?
Common reasons:
- Existing DPAs / compliance — defensible, hard to switch.
- Computer Use — only OpenAI has the CUA-trained model that works reliably.
- Realtime API — sub-300ms voice loop, no equivalent elsewhere.
- Agents SDK — built around hosted state on OpenAI’s side.
For these workloads, your options are:
- Cache aggressively. OpenAI cache is $0.50/M, 10x cheaper than fresh. Use it.
- Use GPT-5.5-mini when it ships. Likely $1/$8.
- Negotiate. $30/M is rack rate. Enterprise contracts at >$100K/year often see 30-50% discounts.
- Pre-process with V4-Pro. Use V4-Pro to filter / summarize / rewrite, then send only the trimmed prompt to GPT-5.5. Cuts GPT-5.5 token volume by 50-70%.
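The pre-processing pattern, sketched with the two model calls stubbed out as injected callables (the function names and the length threshold are illustrative):

```python
from typing import Callable

def two_stage(prompt: str,
              cheap_summarize: Callable[[str], str],
              expensive_answer: Callable[[str], str],
              max_chars: int = 8_000) -> str:
    """Trim long context with the cheap model before paying GPT-5.5 rates.
    Prompts under the threshold skip the first stage entirely."""
    if len(prompt) > max_chars:
        prompt = cheap_summarize(prompt)  # V4-Pro: filter / summarize / rewrite
    return expensive_answer(prompt)       # GPT-5.5 sees only the trimmed text
```

In production the two callables would wrap real API clients for V4-Pro and GPT-5.5 respectively.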
What’s coming next 30 days
- GPT-5.5-mini — likely May, $1/$8 rumored.
- Anthropic Sonnet 4.7 — could undercut GPT-5.5 on quality-per-dollar.
- DeepSeek V4-Reasoning — extended thinking, targeting GPT-5.5 xhigh.
- OpenAI volume discounts — likely tightening for >$50K/mo accounts.
TL;DR
- Audit your spend. Most teams have 60-80% of OpenAI calls on bounded workloads.
- Route bounded traffic to V4-Pro via OpenRouter / LiteLLM. Expect 70-85% cost savings.
- Keep GPT-5.5 for long autonomous agents, Computer Use, Realtime voice.
- Cache aggressively on whichever provider you stay with.
- Wait for GPT-5.5-mini before deciding on full cutover if you’re committed to OpenAI.
Don’t panic. Don’t big-bang migrate. Run both, measure, save 50-70%.
Sources: OpenAI GPT-5.5 announcement (April 23, 2026), DeepSeek V4 release notes (April 24, 2026), Anthropic pricing page, Artificial Analysis benchmarks, OpenRouter pricing data.