What Is GPT-5.4 Mini? OpenAI's New Budget Powerhouse
GPT-5.4 mini is OpenAI’s latest small model, released on March 17, 2026, as part of the GPT-5.4 family. It delivers near-flagship reasoning and coding performance at a fraction of the cost, making it one of the most compelling budget AI models available today.
Priced at $0.15 per million input tokens and $0.60 per million output tokens, GPT-5.4 mini is designed for developers who need strong AI capabilities without flagship model pricing. It’s more than 2x faster than its predecessor GPT-5 mini, while significantly outperforming it across every major benchmark.
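At those rates, per-call or per-batch costs are easy to estimate. A minimal sketch in Python using the prices quoted above (the token counts in the example are illustrative, not from any real workload):

```python
# Published per-million-token rates for GPT-5.4 mini.
INPUT_RATE = 0.15   # dollars per 1M input tokens
OUTPUT_RATE = 0.60  # dollars per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a workload given raw token counts."""
    return (input_tokens / 1_000_000) * INPUT_RATE + \
           (output_tokens / 1_000_000) * OUTPUT_RATE

# e.g. a batch job consuming 10M input and producing 2M output tokens:
print(f"${estimate_cost(10_000_000, 2_000_000):.2f}")  # → $2.70
```

The same two-rate structure applies to any model tier; only the constants change.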
Benchmark Performance
GPT-5.4 mini punches well above its weight class on every major evaluation:
| Benchmark | GPT-5.4 Mini | GPT-5 Mini | Improvement |
|---|---|---|---|
| SWE-bench Pro | 54.38% | 45.69% | +8.69 pts |
| Terminal-Bench 2.0 | 60.0% | 38.20% | +21.8 pts |
| GPQA Diamond | 88.01% | — | Near GPT-5.4’s 93% |
| OSWorld-Verified | 72.13% | 42.0% | +30.13 pts |
| Speed | >2x faster | Baseline | >2x |
The most striking improvement is on Terminal-Bench 2.0, where GPT-5.4 mini scores 60% compared to GPT-5 mini’s 38.20% — a 57% relative improvement. This benchmark measures real-world coding agent performance in terminal environments, making it highly relevant for developer tooling.
On GPQA Diamond, which tests graduate-level scientific reasoning, GPT-5.4 mini hits 88.01% — remarkably close to the full GPT-5.4 model’s 93%. This near-flagship performance at budget pricing is what makes the model compelling.
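The absolute and relative deltas quoted above follow directly from the table; a quick sketch of the arithmetic, using the Terminal-Bench 2.0 scores as the example:

```python
def deltas(new: float, old: float) -> tuple[float, float]:
    """Absolute improvement (points) and relative improvement (%)."""
    return new - old, (new - old) / old * 100

# Terminal-Bench 2.0: GPT-5.4 mini vs GPT-5 mini
abs_pts, rel_pct = deltas(60.0, 38.2)
print(f"+{abs_pts:.1f} pts, {rel_pct:.0f}% relative")  # → +21.8 pts, 57% relative
```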
Key Use Cases
GPT-5.4 mini targets several high-value applications:
Coding assistants — With 54.38% on SWE-bench Pro, it handles complex software engineering tasks including bug fixes, feature implementation, and code review. The 60% Terminal-Bench 2.0 score confirms strong real-world coding performance.
AI subagents — Its speed and cost make it ideal as a worker model in multi-agent architectures. A supervisor agent can delegate specialized subtasks to GPT-5.4 mini without burning through compute budgets.
Computer use — The 72.13% OSWorld-Verified score (up from GPT-5 mini’s 42%) makes it viable for GUI automation, web browsing, and desktop interaction tasks.
Multimodal applications — Strong reasoning combined with fast inference makes it suitable for applications that process images, documents, or mixed-media inputs at scale.
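The subagent pattern above can be sketched as a supervisor that routes each subtask to the cheapest adequate tier. This is an illustrative sketch only: the model identifiers and the tier labels are assumptions, not confirmed API names, and a real supervisor would pick tiers from task analysis rather than a hand-set field.

```python
from dataclasses import dataclass

# Hypothetical model identifiers — check the API docs for exact names.
MODEL_BY_TIER = {
    "simple": "gpt-5.4-nano",    # high-volume, trivial subtasks
    "standard": "gpt-5.4-mini",  # the default worker model
    "hard": "gpt-5.4",           # escalate only when peak quality is needed
}

@dataclass
class Subtask:
    description: str
    tier: str  # "simple" | "standard" | "hard"

def assign_models(subtasks: list[Subtask]) -> list[tuple[str, str]]:
    """Supervisor-side routing: map each subtask to a model tier."""
    return [(t.description, MODEL_BY_TIER[t.tier]) for t in subtasks]

plan = assign_models([
    Subtask("classify ticket priority", "simple"),
    Subtask("draft a bug-fix patch", "standard"),
])
```

Keeping routing logic on the supervisor side like this means the budget decision is made once per subtask, before any tokens are spent.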
Who’s Using It?
Early adopters include Hebbia, which uses GPT-5.4 mini for document analysis and knowledge work, and Notion, which integrates it into AI-powered features across its productivity platform. Both companies benefit from the model’s balance of capability and cost efficiency.
How It Fits the GPT-5.4 Family
The GPT-5.4 family offers a clear tiering:
- GPT-5.4 — Full flagship model with 93% GPQA Diamond
- GPT-5.4 Thinking — Reasoning-optimized variant
- GPT-5.4 mini — Near-flagship performance at budget pricing
- GPT-5.4 nano — Smallest and fastest for high-volume simple tasks
GPT-5.4 mini occupies the sweet spot: capable enough for complex work, cheap enough for high-volume usage. For most production workloads that don’t require absolute peak performance, it’s the default choice.