AI agents · OpenClaw · self-hosting · automation

Quick Answer

GLM-5.2 vs DeepSeek V4 Pro vs Kimi K2.7 Code: Open-Weight June 2026

Published:

GLM-5.2 vs DeepSeek V4 Pro vs Kimi K2.7 Code: Open-Weight June 2026

Three frontier open-weight models from three Chinese labs, released in three different June 2026 weeks. GLM-5.2 (Z.ai, June 16), DeepSeek V4 Pro (continuing 2026 leader), and Kimi K2.7 Code (Moonshot, June 12) are now the open-weight options that matter. Here is how they compare for production deployment.

Last verified: June 19, 2026.

TL;DR

  • GLM-5.2 — new #1 open-weight on Intelligence Index at 51. Best for long-horizon coding, 1M context.
  • DeepSeek V4 Pro — price-per-capability leader. Largest parameter count. Best for raw SWE workloads.
  • Kimi K2.7 Code — strongest MCP tool-use scores. Best for agentic MCP-heavy stacks.
  • Honest pick: Most teams will route across all three based on workload type.

Direct comparison

SpecGLM-5.2DeepSeek V4 ProKimi K2.7 Code
ReleaseJune 16, 2026 (open)Active 2026 leaderJune 12, 2026
LabZ.ai (Beijing)DeepSeek (Hangzhou)Moonshot AI (Beijing)
LicenseMITMITModified MIT
Total parameters753B1.6T1T
Active parameters40B~200B~30B
Context window1M128K-200K256K
Vision inputNoNoYes (MoonViT)
AA Intelligence Index51 (#1 open)44 (max)~43 (K2.6 baseline)
MCP AtlasNot reportedNot reported76.0
MCP Mark VerifiedNot reportedNot reported81.1
OpenRouter input/M$1.40<$1.00varies
OpenRouter output/M$4.40<$2.00varies
Direct API input/Mvaries$0.27-$0.55$0.95
Direct API output/Mvaries$1.10-$2.19$4.00
Self-host VRAM~half DeepSeekMulti-H100/H200~595GB weights
First-party workspaceNoNoKimi Code (kimi.com/code)

When GLM-5.2 wins

  • Long-horizon agentic coding. 1M context window plus optimization for multi-step engineering loops.
  • Top open-weight intelligence. 51 on Intelligence Index v4.1 is the highest open score.
  • Self-host cost efficiency. 40B active parameters fits on fewer GPUs than DeepSeek’s ~200B active.
  • Front-end coding. Ranked #2 on Code Arena WebDev despite no vision input — second only to Claude Fable 5.

When DeepSeek V4 Pro wins

  • Pure cost-per-capability. Still the lowest API pricing per task across most providers.
  • Standard SWE workloads. 1.6T parameter MoE has more raw computational headroom for the hardest single-turn coding problems.
  • Ecosystem maturity. Available on dozens of inference platforms globally with OpenAI-compatible APIs.
  • Community fine-tunes. MIT license plus existing community has produced the most derivative variants.

When Kimi K2.7 Code wins

  • MCP tool-heavy workflows. MCP Atlas 76.0 and MCP Mark Verified 81.1 are the open-weight ceiling.
  • You need vision input. MoonViT encoder for screenshots, diagrams, scientific figures.
  • First-party agentic workspace. Kimi Code (kimi.com/code) is the only Cursor-style web workspace among the three.
  • You self-host with limited GPUs. ~30B active means cheapest per-token inference at scale.

The architecture story

The three labs made three different bets on what frontier open-weight should look like:

  • Z.ai (GLM-5.2): Lean MoE, biggest context window. Bet on long-horizon agentic depth.
  • DeepSeek (V4 Pro): Maximum parameter count, lowest price. Bet on capability-per-dollar at the model layer.
  • Moonshot (K2.7 Code): Mid-size MoE, best MCP tool-use, first-party workspace. Bet on agentic ecosystem.

In June 2026, none of these bets is obviously right. All three are top-10 open-weight models. All three are MIT or near-MIT licensed. All three have multiple commercial inference providers.

Routing recommendations

WorkloadFirst choiceFallback
Long-horizon agentic codingGLM-5.2Kimi K2.7 Code
Cost-sensitive bulk SWEDeepSeek V4 ProGLM-5.2
MCP tool-heavy agentsKimi K2.7 CodeGLM-5.2
Vision-required codingKimi K2.7 Code(closed: Fable 5)
Web Composer-style UIKimi Code workspaceCursor + GLM-5.2
Self-host with limited VRAMKimi K2.7 Code (30B active)GLM-5.2 (40B active)
1M context requiredGLM-5.2Kimi K2.7 Code (256K)

Why this matters now

With Claude Fable 5’s free Pro/Max access ending June 22, 2026, and credit-based pricing kicking in June 23, every production team using Fable 5 inside Claude Code or Cursor is now doing the routing math. The three models above are the open-weight options that don’t lock you into a single closed-frontier vendor’s pricing schedule.

The pattern that is winning in late June 2026:

  1. Keep Claude Fable 5 (or Opus 4.8) as the quality ceiling for the hardest 10-20% of tasks.
  2. Route 60-80% of agentic coding bulk to one of these three open-weight models based on workload type.
  3. Use OpenRouter or a self-hosted vLLM/SGLang stack to abstract the routing.

The open-weight tier is no longer “good enough for cheap work.” It is “competitive on capability with closed frontier for everything except the absolute hardest tasks.” June 2026 is when that became true.