What's new in Claude Opus 4.8 vs Opus 4.7?

Opus 4.8 (released May 28, 2026, 41 days after Opus 4.7) brings four headline changes. (1) Benchmarks: SWE-bench Verified 88.6% (up from 87.6%), Terminal-Bench 2.1 74.6%, GPQA Diamond 93.6%, GDPval-AA Elo 1890 (+121 over GPT-5.5), Online-Mind2Web 84%. (2) Honesty: substantially lower misaligned behavior on internal evals across military-grade weapons content, harmful sexual content, disallowed cyberoffense, and undermining democratic institutions. (3) Fast mode: 2.5x faster generation at significantly reduced rates. (4) Dynamic workflows in research preview, capped at 16 concurrent / 1,000 total subagents per run.

Should I upgrade from Opus 4.7 to Opus 4.8?

For most use cases, yes — the upgrade is mechanically simple (change model ID) and the gains compound. Production AI coding agents see meaningful improvement on hard refactors. Customer-facing chatbots benefit from the honesty improvements. Cost is the same on standard tier ($5/$25 per million tokens), and fast mode actually costs less than Opus 4.7 fast mode. The main reason NOT to upgrade immediately: if you depend on a specific Opus 4.7 behavior in production (a particular failure mode you've already prompt-engineered around), do an A/B before committing.

Did Opus 4.8 raise prices?

No. Opus 4.8 standard tier is priced at the same $5 per million input tokens and $25 per million output tokens as Opus 4.7. Fast mode pricing was reduced — Anthropic says 2.5x faster generation at significantly reduced rates compared to Opus 4.7 fast mode. The exact fast mode per-token rate is shown in your Anthropic console; reporting from Fortune confirms it's lower than Opus 4.7 fast mode.

What are Opus 4.8 dynamic workflows?

Dynamic workflows are a research-preview feature in Claude Code, launched alongside Opus 4.8 on May 28, 2026. They let Claude write orchestration scripts that spawn tens to hundreds of parallel subagents in a single session, with iterative verification and resumable state. Cap: 16 concurrent subagents, 1,000 total per run. Use case: full codebase migrations, large refactors, multi-file investigations — work that doesn't fit in one Claude Code session but is still one logical task. Available in Claude Code via the dynamic workflow primitives.

Quick Answer

Claude Opus 4.8 vs Opus 4.7: Should You Upgrade? (June 2026)

Published: June 1, 2026

Claude Opus 4.8 vs Opus 4.7: Should You Upgrade? (June 2026)

Anthropic shipped Opus 4.8 on May 28, 2026 — exactly 41 days after Opus 4.7. That’s a fast cadence by frontier-model standards, and the changes are real, not cosmetic. Here’s the head-to-head and the upgrade decision.

Last verified: June 1, 2026.

TL;DR

	Opus 4.8	Opus 4.7
Released	May 28, 2026	April 17, 2026
SWE-bench Verified	88.6%	87.6%
Terminal-Bench 2.1	74.6%	~72%
GPQA Diamond	93.6%	~92%
GDPval-AA Elo	1890 (+121 over GPT-5.5)	~1820
Online-Mind2Web	84%	lower
Honesty / alignment	Materially better	Baseline
Standard pricing	$5 / $25 per million tokens	$5 / $25 per million tokens
Fast mode	2.5x faster, reduced rate	Slower, higher rate
Dynamic workflows	Research preview (16 concurrent, 1,000 total subagents)	Not available

The benchmark gains, in context

Most version-over-version frontier model jumps in 2025–2026 were 0.5–2 percentage points on SWE-bench. Opus 4.8’s +1.0% on SWE-bench Verified (87.6% → 88.6%) is modest but real. The bigger story is:

Terminal-Bench 2.1 at 74.6% — meaningfully ahead of GPT-5.5’s 82.7% on Terminal-Bench (note: different versions; 4.8 is on the harder 2.1 benchmark)
GDPval-AA Elo 1890 — 121 Elo points ahead of GPT-5.5
Online-Mind2Web 84% — a meaningful jump for browser agents, per Browserbase’s Miguel Gonzalez

The Online-Mind2Web number is the one to watch. Browser-agent reliability is one of the hardest things to ship in production AI. Opus 4.8 at 84% means Claude can now drive web browsers for serious automation tasks where Opus 4.7 used to get stuck.

The honesty story

The most-discussed change in Opus 4.8 isn’t the SWE-bench delta — it’s the alignment improvement. Per the launch announcement and Medium analysis from Data Science Collective:

“On internal misalignment categories — military-grade weapons content, harmful sexual content, disallowed cyberoffense, undermining democratic institutions — Opus 4.8 scores markedly better than both Opus 4.7 and Sonnet 4.6.”

This matters in two ways:

Reduced false confidence. Opus 4.8 is better at saying “I don’t know” or “I’m uncertain.” On long agent sessions where confidently-wrong answers compound, this is a quality-of-life win that compounds.
Enterprise risk reduction. For regulated buyers (finance, healthcare, government), alignment improvements are a procurement signal. Opus 4.8 is easier to ship internally than Opus 4.7 was.

Dynamic workflows (research preview)

Launched alongside Opus 4.8, dynamic workflows let Claude Code spin up tens to hundreds of parallel subagents in a single session.

The caps:

16 concurrent subagents at any time
1,000 total subagents over a single run

Use cases Anthropic and early users have flagged:

Full codebase migrations (React class components → hooks across 800 files; CommonJS → ESM)
Large multi-file refactors
Cross-codebase consistency sweeps (auth pattern updates, error-handling refactors)
Investigation work (find all instances of an anti-pattern, propose fixes, generate PRs)

This is the feature that competes directly with Cognition Devin’s autonomous-engineer pitch — but inside Claude Code, integrated with the developer’s local environment.

Fast mode pricing change

Standard Opus 4.8 keeps the same $5 input / $25 output per-million-token pricing as Opus 4.7. No surprise there — Anthropic explicitly said pricing is unchanged for standard.

Fast mode is where Opus 4.8 actually got cheaper. Fortune’s reporting confirms: 2.5x faster generation than Opus 4.7 fast mode at significantly reduced per-token rates. Exact rate is shown in the Anthropic console, but the practical implication: fast mode is now the default sensible choice for the 80% of production work where you don’t need the absolute best reasoning chain.

(See our separate comparison: Opus 4.8 Fast Mode vs GPT-5.5 vs Gemini 3.5 Flash for the cost-routing math.)

When NOT to upgrade

For 95% of teams, the upgrade is a no-brainer. But two scenarios where you should wait:

1. You’ve prompt-engineered against a specific Opus 4.7 quirk. If your production pipeline relies on a known failure mode of Opus 4.7 (you’ve added a workaround for it), upgrading might break that workaround. Run an A/B on a representative slice before flipping prod.

2. You’re mid-migration to dynamic workflows. Dynamic workflows are in research preview. The API surface may change. If you’re building production tooling that depends on the preview shape, expect breakage in subsequent releases. Build with that in mind.

Migration checklist

If you decide to upgrade:

Update model ID in your API calls (claude-opus-4-8 or whatever your provider exposes)
Re-run your eval suite — your numbers will improve on most metrics, but verify
Re-tune any temperature / top_p settings if you’ve over-fit them to Opus 4.7
Watch your alignment-sensitive use cases — Opus 4.8 may refuse some prompts Opus 4.7 accepted (good for safety, may surprise you)
Test fast mode on a non-critical workload to see if you can drop tier on the bulk of your traffic
Pilot dynamic workflows on one large refactor to learn the orchestration model before committing

How Opus 4.8 fits the broader landscape (June 2026)

Opus 4.8 is now the SWE-bench Verified leader (88.6%) and the GDPval Elo leader (1890)
GPT-5.5 still leads on Terminal-Bench (82.7%) and 1M-context retrieval; GPT-5.6 expected by June 30, 2026
Gemini 3.5 Flash owns cost ($1.50/$9) and 2M context; Gemini 3.5 Pro launches June 2026
DeepSeek V4 Pro is the cheapest frontier-tier model after its 75% price cut (May 23, 2026)

For coding-heavy workloads, Opus 4.8 standard is the highest-quality option in June 2026. For balanced cost/quality, Opus 4.8 Fast Mode. For cost-leader, Gemini 3.5 Flash or DeepSeek V4 Pro.

Sources

Anthropic: Introducing Claude Opus 4.8 (May 28, 2026) — official announcement
LLM Stats: Claude Opus 4.8 Release, Benchmarks — full benchmark breakdown
Vellum: Claude Opus 4.8 Benchmarks Explained
MarkTechPost: Anthropic Ships Opus 4.8 + Dynamic Workflows
TechCrunch: Opus 4.8 with new dynamic workflow tool

Bottom line

Opus 4.8 is a clean, no-drama upgrade for most teams running Opus 4.7 in production. Better benchmarks, better alignment, cheaper fast mode, and dynamic workflows on top. Same standard pricing. Flip the model ID, re-run your evals, and move on.