What Is GLM-5.2? Z.ai's Long-Horizon Coding Model Explained
What Is GLM-5.2? Z.ai’s Long-Horizon Coding Model Explained
Z.ai released GLM-5.2 to coding-plan subscribers on June 13, 2026 and dropped full open-weights under MIT on June 16. Within 24 hours of the open-weight release, Artificial Analysis ranked it #1 among all open-weight models on the Intelligence Index v4.1. Here is what GLM-5.2 actually is, why it matters, and how to use it.
Last verified: June 19, 2026.
TL;DR
- GLM-5.2 is Z.ai’s flagship language model — 753B total parameters, 40B active, MoE architecture.
- 1 million token context window (up from 200K in GLM-5.1).
- Text-only input. No vision encoder.
- MIT licensed. Fully open weights on Hugging Face.
- Released June 13 (subscribers) / June 16 (open weights), 2026.
- Currently #1 open-weight model on the AA Intelligence Index v4.1 at score 51.
- ~$1.40/$4.40 per M tokens via OpenRouter from 9 providers.
The headline numbers
| Metric | GLM-5.2 |
|---|---|
| Total parameters | 753B |
| Active parameters | 40B |
| Architecture | Mixture-of-Experts |
| Weights size | 1.51 TB |
| Context window | 1,000,000 tokens |
| Vision input | No |
| License | MIT |
| AA Intelligence Index v4.1 | 51 (#1 open-weight) |
| Code Arena WebDev rank | #2 (behind Claude Fable 5) |
| OpenRouter input price | ~$1.40 / M tokens |
| OpenRouter output price | ~$4.40 / M tokens |
| Output tokens per AA Index task | ~43k |
| Comparable closed-frontier price | $5/$25 (Opus) or $10/$50 (Fable 5) |
What Z.ai built it for
Z.ai positions GLM-5.2 explicitly as a long-horizon-task model. The pitch is that for multi-step agentic coding workflows — repo exploration, planning, parallel sub-task execution, verification — the model needs three things:
- Big context window. Long horizons mean large state. The 1M token window is 5x GLM-5.1’s window.
- Robust agentic capability. Z.ai claims significant gains on Code Arena WebDev (now ranked #2 behind only Claude Fable 5) and on coding benchmarks specifically tied to multi-step engineering.
- Open-weight reliability. MIT license means teams can self-host on infrastructure they control, satisfying sovereignty and data-residency requirements that block closed-frontier alternatives.
The trade-off is token consumption. AA measured GLM-5.2 using ~43k output tokens per Intelligence Index task, materially more than MiniMax-M3 (24k), Kimi K2.6 (35k), or even DeepSeek V4 Pro max (37k). The model “thinks more out loud” — which contributes to its strong scores but adds to per-task cost.
Where GLM-5.2 sits on the benchmarks
AA Intelligence Index v4.1:
- GLM-5.2: 51 (highest open-weight score recorded)
- MiniMax-M3: 44
- DeepSeek V4 Pro (max): 44
- Kimi K2.6: 43
- Claude Fable 5: 64.9 (closed-frontier leader)
- GPT-5.5: ~60 (closed)
Code Arena WebDev: #2 overall, behind only Claude Fable 5. Notable because GLM-5.2 has no vision input, and front-end web development is typically a vision-advantaged task (screenshots, design references, visual debugging).
SVG generation: Anecdotal but cited — Simon Willison’s “pelican riding a bicycle” SVG benchmark showed strong results, consistent with GLM-5.1’s strength.
How to access GLM-5.2
| Path | Best for | Pricing |
|---|---|---|
| OpenRouter | Quick integration into existing OpenAI-compatible stacks | ~$1.40/$4.40 per M |
| Z.ai coding plan | Direct subscription with premium features | Subscription |
| Hugging Face weights | Self-host, fine-tune, sovereign deployment | Inference cost only |
| vLLM / SGLang | High-throughput self-hosted inference | Inference cost only |
The 1.51 TB weight size means self-hosting requires multi-H100 or H200 territory. For most teams, OpenRouter is the practical entry point.
How GLM-5.2 changes the routing math
Before GLM-5.2, the highest-quality open-weight model was meaningfully behind closed-frontier on most benchmarks. Teams routing agentic coding workloads typically used closed-frontier models (Claude Opus 4.8, GPT-5.5) for the bulk of work and reserved smaller open-weight models for cost-sensitive long-tail tasks.
GLM-5.2 narrows that gap. The Intelligence Index gap to Claude Fable 5 is roughly 14 points (51 vs 64.9). For everything except the hardest 10-20% of tasks, GLM-5.2 is now competitive. At 4-7x lower cost. With self-hostable weights.
For production teams, the practical impact:
- The “default” tier for agentic coding shifts from closed-frontier to GLM-5.2 for most workloads.
- Claude Fable 5 (or Opus 4.8) becomes the quality-ceiling fallback for the hardest tasks.
- Cost-per-task drops substantially for any team that adopts the routing pattern.
- Sovereignty-required deployments now have a credible top-tier option.
What GLM-5.2 is not
- Not multimodal. No vision input. For screenshot-to-code, scientific figure analysis, or visual debugging, you still need Fable 5, Opus 4.8, GPT-5.5, or Kimi K2.7 Code.
- Not the smartest model in the world. Claude Fable 5 still leads the Intelligence Index at 64.9.
- Not the cheapest per task. DeepSeek V4 Pro is still slightly cheaper for many workloads given GLM-5.2’s high output-token consumption.
- Not new architecture territory. It is a refined MoE, not a paradigm shift.
The honest read
GLM-5.2 is the most important open-weight release of June 2026 — possibly of the year so far. The combination of #1-open-weight Intelligence Index score, 1M context window, MIT license, and 4-7x cost advantage over closed-frontier means it now belongs in every production routing stack that involves agentic coding work.
It is not a replacement for Claude Fable 5. It is the new default for 60-80% of the work that used to require Fable 5 — and that is the meaningful cost shift for any team running agentic coding at scale in late June 2026.