What is DeepSeek V4-Pro? Specs, Benchmarks, Price (April 2026)
DeepSeek V4-Pro launched on April 24, 2026 — and it’s the most consequential AI release of the year so far. Here’s a complete primer on what it is, what it can do, and why it matters, as of April 25, 2026.
Last verified: April 25, 2026
The 30-second summary
- Maker: DeepSeek (Hangzhou, China)
- Released: April 24, 2026 (preview)
- Type: Mixture-of-Experts large language model
- Total parameters: 1.6 trillion
- Active parameters per token: 49 billion
- Context window: 1,048,576 tokens (1M)
- SWE-bench Verified: 80.6%
- Pricing: $1.74 input / $3.48 output per million tokens
- License: DeepSeek custom open-weight license (commercial use OK)
- Weights available on: Hugging Face (deepseek-ai/DeepSeek-V4-Pro)
V4-Pro is the first open-weight model that lands within statistical noise of Claude Opus 4.7 and GPT-5.5 on coding benchmarks — at roughly one-seventh the price.
Architecture
V4-Pro is a Mixture-of-Experts (MoE) transformer with substantial improvements over V3:
- Total experts: Undisclosed, but inferred to be ~256
- Active per token: 49B parameters (~3% of the total)
- Attention: Multi-head Latent Attention (MLA), DeepSeek’s compression technique that cuts KV cache memory by ~90%
- Position encoding: YaRN-extended for the full 1M context
- Training tokens: Estimated 18-22 trillion tokens (DeepSeek hasn’t published exact figures)
- Training compute: Mix of Nvidia H800/H200 and Huawei Ascend 950 chips
V4-Pro uses the same MoE-with-MLA pattern as V3 but scales the parameter count from ~671B to 1.6T, allowing more specialized experts and higher-quality routing.
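To see why MLA's KV-cache compression matters at a 1M-token context, here is a rough back-of-envelope sketch. The layer count, head count, and head dimension below are assumptions for illustration only (DeepSeek has not published V4-Pro's exact dimensions); the ~90% reduction figure is the one cited above.

```python
# Rough KV-cache sizing sketch. All model dimensions below are ASSUMED
# for illustration; DeepSeek has not published V4-Pro's exact config.
BYTES_PER_VALUE = 2          # fp16/bf16
NUM_LAYERS = 60              # assumption
NUM_KV_HEADS = 128           # assumption
HEAD_DIM = 128               # assumption
CONTEXT_TOKENS = 1_048_576   # 1M-token context from the spec sheet

# Standard attention stores a key and a value vector per head, per layer, per token.
kv_per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
standard_cache_gb = kv_per_token * CONTEXT_TOKENS / 1e9

# MLA stores a compressed latent instead; the section above cites ~90% savings.
mla_cache_gb = standard_cache_gb * 0.10

print(f"Standard KV cache at 1M tokens: ~{standard_cache_gb:,.0f} GB")
print(f"MLA (~90% smaller):             ~{mla_cache_gb:,.0f} GB")
```

Under these assumed dimensions, a full-length context would need terabytes of KV cache with standard attention; compression is what makes the 1M window practical at all.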
Benchmarks
DeepSeek-reported and partial third-party verification:
| Benchmark | V4-Pro | Claude Opus 4.7 | GPT-5.5 |
|---|---|---|---|
| MMLU-Pro | 83.2% | 84.1% | 84.8% |
| GPQA Diamond | 78.6% | 79.4% | 80.1% |
| SWE-bench Verified | 80.6% | 80.8% | 76.4% |
| Terminal-Bench 2.0 | 67.9% | 65.4% | 82.7% |
| LiveCodeBench | 93.5% | 88.8% | 91.2% |
| AIME 2026 (math) | 88.4% | 89.1% | 87.4% |
| τ²-Bench (agents) | 71.4% | 76.2% | 78.9% |
| Aider Polyglot | 79.8% | 81.2% | 75.4% |
Key takeaway: V4-Pro is at the frontier on coding-heavy benchmarks. It trails on agentic benchmarks (where GPT-5.5’s Dynamic Reasoning Time and computer use shine) and is roughly tied on knowledge benchmarks.
Pricing
| | V4-Pro | V4-Flash | Claude Opus 4.7 | GPT-5.5 |
|---|---|---|---|---|
| Input ($/M) | $1.74 | $0.14 | $5.00 | $5.00 |
| Output ($/M) | $3.48 | $0.28 | $25.00 | $30.00 |
| Combined | $5.22 | $0.42 | $30.00 | $35.00 |
DeepSeek pricing is roughly 7-9× cheaper than the closed frontier on output tokens, where most real-world cost lives, and about 6-7× cheaper on combined pricing. V4-Flash is 70-100× cheaper than the closed frontier.
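A quick way to sanity-check those multiples is to cost out your own traffic. The per-million prices below come straight from the table; the traffic volumes are placeholders.

```python
# Monthly cost estimate from the per-million-token prices in the table above.
# Traffic volumes are placeholders; plug in your own numbers.
PRICES = {  # (input $/M tokens, output $/M tokens)
    "deepseek-v4-pro": (1.74, 3.48),
    "deepseek-v4-flash": (0.14, 0.28),
    "claude-opus-4.7": (5.00, 25.00),
    "gpt-5.5": (5.00, 30.00),
}

def monthly_cost(model: str, input_millions: float, output_millions: float) -> float:
    """Dollar cost for a month of traffic, given token volumes in millions."""
    in_price, out_price = PRICES[model]
    return input_millions * in_price + output_millions * out_price

# Example: 500M input tokens and 100M output tokens per month.
for model in PRICES:
    print(f"{model:20s} ${monthly_cost(model, 500, 100):>10,.2f}/month")
```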
Where to use V4-Pro
Direct API
DeepSeek’s first-party API at api.deepseek.com — cheapest, but China-hosted (data residency concerns for some).
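DeepSeek's first-party API has historically been OpenAI-compatible, so the standard OpenAI SDK should work against it. The model identifier below is an assumption; check DeepSeek's API docs for the exact string exposed for V4-Pro.

```python
# Minimal call against DeepSeek's OpenAI-compatible first-party API.
# The model id "deepseek-v4-pro" is an assumption; confirm it in the API docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",
    api_key="YOUR_DEEPSEEK_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a linked list."},
    ],
)
print(response.choices[0].message.content)
```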
US/EU-hosted resellers
- OpenRouter — easiest router integration
- Together AI — strong throughput
- Fireworks AI — best tool-calling
- DeepInfra — competitive pricing
- Hyperbolic — useful for batch
Self-hosted
- vLLM on Nvidia (16× H200 minimum for V4-Pro full quality)
- vLLM-Ascend on Huawei Ascend 950 supernodes
- MLX on Apple Silicon (V4-Flash only — V4-Pro is too big)
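For self-hosting, a minimal vLLM offline-inference sketch looks like the following. The parallelism and context-length settings are illustrative assumptions sized to the "16× H200" note above; a real deployment needs cluster-specific (likely multi-node) tuning.

```python
# Minimal vLLM offline-inference sketch for self-hosting the open weights.
# Parallelism and context-length values are assumptions, not a verified recipe.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V4-Pro",  # Hugging Face repo from the spec sheet
    tensor_parallel_size=16,              # assumption: shard across 16 GPUs
    max_model_len=131072,                 # assumption: serve 128K rather than the full 1M
    trust_remote_code=True,
)

outputs = llm.generate(
    ["Explain Mixture-of-Experts routing in two sentences."],
    SamplingParams(temperature=0.6, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```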
What’s the catch?
A few honest caveats:
- Custom license, not OSI-compliant. Apache 2.0 and MIT models like Llama 5 and Kimi K2.6 are "open source" in the stricter sense; V4-Pro is open-weight with usage restrictions.
- Tool-calling reliability is ~5 points behind Claude Opus 4.7 on third-party benchmarks. Most tool calls work; some need retry logic (see the sketch after this list).
- No native multimodal. V4 is text-only. For vision/audio you still need Gemini 3.1 Pro or GPT-5.5.
- No native computer use. GPT-5.5 still owns this category.
- Safety calibration is improved over V3 but has rougher edges than Anthropic models, especially around politically sensitive topics.
- First-party API is China-hosted. Compliance-sensitive teams should use US/EU resellers or self-host.
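On the tool-calling caveat, a simple retry-and-validate loop covers most of the gap in practice. This is a generic sketch: `call_model` is a placeholder for whatever client you use, and the retry count and backoff are arbitrary choices.

```python
# Generic retry loop for flaky tool calls. `call_model` is a placeholder for
# your actual client call; retry count and backoff are arbitrary choices.
import json
import time

def call_tool_with_retries(call_model, messages, tools, max_retries=3):
    """Re-ask the model until it returns a tool call whose JSON arguments parse."""
    for attempt in range(1, max_retries + 1):
        response = call_model(messages=messages, tools=tools)
        tool_calls = getattr(response, "tool_calls", None)
        if tool_calls:
            try:
                # Validate that the arguments parse before dispatching the tool.
                json.loads(tool_calls[0].function.arguments)
                return tool_calls[0]
            except (json.JSONDecodeError, AttributeError):
                pass
        time.sleep(2 ** attempt)  # simple exponential backoff before retrying
    raise RuntimeError(f"No valid tool call after {max_retries} attempts")
```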
Why it matters
The DeepSeek R1 launch in January 2025 wiped $600B off Nvidia’s market cap in 24 hours and proved frontier reasoning could be done cheaply. The V4-Pro launch is the second compression event — but this time targeting general intelligence, agentic coding, and 1M context.
Three knock-on effects to watch:
- Anthropic and OpenAI pricing pressure. Expect Sonnet 4.7 / 4.8 and GPT-5.5-mini repricing within 6-8 weeks.
- Open-weight ecosystem acceleration. Kimi K2.6, GLM-5.1, Llama 5 will all push harder.
- Routing wins everything. Teams using LiteLLM, OpenRouter, or Helicone-style routers capture the savings; teams locked to one provider don't (see the failover sketch below).
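As one concrete version of the "stop being single-vendor" point, here is a minimal failover sketch using the OpenAI SDK against two OpenAI-compatible endpoints (DeepSeek direct and OpenRouter). The base URLs are the providers' published endpoints; the model identifiers are assumptions to check against each provider's catalog.

```python
# Minimal multi-provider failover: try DeepSeek's first-party endpoint, then
# fall back to OpenRouter. Model identifiers below are assumptions.
import os
from openai import OpenAI

PROVIDERS = [
    ("https://api.deepseek.com", "DEEPSEEK_API_KEY", "deepseek-v4-pro"),
    ("https://openrouter.ai/api/v1", "OPENROUTER_API_KEY", "deepseek/deepseek-v4-pro"),
]

def completion_with_fallback(messages):
    """Try each provider in order; return the first successful completion."""
    last_error = None
    for base_url, key_env, model in PROVIDERS:
        try:
            client = OpenAI(base_url=base_url, api_key=os.environ[key_env])
            resp = client.chat.completions.create(model=model, messages=messages)
            return resp.choices[0].message.content
        except Exception as err:  # network errors, rate limits, missing keys
            last_error = err
    raise RuntimeError(f"All providers failed: {last_error}")

print(completion_with_fallback([{"role": "user", "content": "Say hello."}]))
```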
Is V4-Pro the new default?
For text-heavy, cost-sensitive workloads: yes.
For ultra-critical PRs, computer-use agents, or multimodal apps: no — keep your existing tooling, but add V4-Pro as a router target.
The smart play in late April 2026 isn’t “switch to V4-Pro.” It’s “stop being single-vendor.”
Where to learn more
- Official: https://api-docs.deepseek.com/news/news260424
- Hugging Face model card: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
- Simon Willison’s analysis: https://simonwillison.net/2026/apr/24/deepseek-v4/
- VentureBeat coverage: https://venturebeat.com/technology/deepseek-v4-arrives-with-near-state-of-the-art-intelligence-at-1-6th-the-cost-of-opus-4-7-gpt-5-5
Last verified: April 25, 2026. Sources: DeepSeek API docs, Hugging Face deepseek-ai/DeepSeek-V4-Pro model card, Simon Willison, VentureBeat, TechCrunch, Reuters, Fortune.