GLM-5.2 is Z.ai's seventh-generation flagship language model, released to coding-plan subscribers on June 13, 2026 and made open-weight under an MIT license on June 16, 2026. It is a 753B-parameter Mixture-of-Experts (MoE) model with 40B active parameters per token, a 1 million token context window, and text-only input (no vision). Z.ai pitches it as their best model for long-horizon tasks — multi-step agentic coding, deep reasoning over large codebases, and complex multi-turn workflows. As of June 17, 2026, GLM-5.2 ranks #1 among open-weight models on the Artificial Analysis Intelligence Index v4.1 with a score of 51, ahead of MiniMax-M3, DeepSeek V4 Pro, and Kimi K2.6.

Why is GLM-5.2 a big deal?

Four reasons. First, it is the highest-scoring open-weight model on the leading independent intelligence benchmark — open-weight just caught up to a meaningful fraction of closed-frontier capability. Second, it is MIT-licensed, meaning anyone can self-host, fine-tune, or commercially deploy it without restrictive licensing. Third, the 1M token context window is the longest in any open-weight model, enabling long-horizon agentic engineering workflows that previously required closed-frontier models. Fourth, it is cheap: ~$1.40/$4.40 per million input/output tokens via OpenRouter, roughly 4-7x cheaper than GPT-5.5 or Claude Fable 5. For teams routing agentic coding workloads, GLM-5.2 changes the cost-quality frontier.

How does GLM-5.2 compare to GLM-5.1?

GLM-5.2 is a substantial upgrade. The context window jumped from 200K to 1M tokens. The Intelligence Index score rose from GLM-5.1's mid-40s to 51, putting it ahead of every other open-weight model on the benchmark. Z.ai also reports significant gains on long-horizon coding tasks specifically, which is the headline use case for the new release. The trade-off: GLM-5.2 is token-hungry — Artificial Analysis measured roughly 43k output tokens per Intelligence Index task vs 26k for GLM-5.1. For total cost of ownership, plan for higher per-task token consumption even though per-token pricing is lower.

Where can I try GLM-5.2?

Three main paths. (1) OpenRouter at openrouter.ai/z-ai/glm-5.2 — available from 9 providers including Together, Hyperbolic, DeepInfra, Fireworks, and others, with OpenAI-compatible APIs and pricing around $1.40/$4.40 per million tokens. (2) Z.ai's own coding plan subscription at z.ai — direct API access including premium features. (3) Self-hosted via Hugging Face at huggingface.co/zai-org/GLM-5.2 — the full 1.51TB model weights are MIT-licensed and run on vLLM, SGLang, or KTransformers. For most teams, OpenRouter is the fastest path. Self-hosting is the cheapest at high volume.

Quick Answer

What Is GLM-5.2? Z.ai's Long-Horizon Coding Model Explained

Published: June 19, 2026

What Is GLM-5.2? Z.ai’s Long-Horizon Coding Model Explained

Z.ai released GLM-5.2 to coding-plan subscribers on June 13, 2026 and dropped full open-weights under MIT on June 16. Within 24 hours of the open-weight release, Artificial Analysis ranked it #1 among all open-weight models on the Intelligence Index v4.1. Here is what GLM-5.2 actually is, why it matters, and how to use it.

Last verified: June 19, 2026.

TL;DR

GLM-5.2 is Z.ai’s flagship language model — 753B total parameters, 40B active, MoE architecture.
1 million token context window (up from 200K in GLM-5.1).
Text-only input. No vision encoder.
MIT licensed. Fully open weights on Hugging Face.
Released June 13 (subscribers) / June 16 (open weights), 2026.
Currently #1 open-weight model on the AA Intelligence Index v4.1 at score 51.
~$1.40/$4.40 per M tokens via OpenRouter from 9 providers.

The headline numbers

Metric	GLM-5.2
Total parameters	753B
Active parameters	40B
Architecture	Mixture-of-Experts
Weights size	1.51 TB
Context window	1,000,000 tokens
Vision input	No
License	MIT
AA Intelligence Index v4.1	51 (#1 open-weight)
Code Arena WebDev rank	#2 (behind Claude Fable 5)
OpenRouter input price	~$1.40 / M tokens
OpenRouter output price	~$4.40 / M tokens
Output tokens per AA Index task	~43k
Comparable closed-frontier price	$5/$25 (Opus) or $10/$50 (Fable 5)

What Z.ai built it for

Z.ai positions GLM-5.2 explicitly as a long-horizon-task model. The pitch is that for multi-step agentic coding workflows — repo exploration, planning, parallel sub-task execution, verification — the model needs three things:

Big context window. Long horizons mean large state. The 1M token window is 5x GLM-5.1’s window.
Robust agentic capability. Z.ai claims significant gains on Code Arena WebDev (now ranked #2 behind only Claude Fable 5) and on coding benchmarks specifically tied to multi-step engineering.
Open-weight reliability. MIT license means teams can self-host on infrastructure they control, satisfying sovereignty and data-residency requirements that block closed-frontier alternatives.

The trade-off is token consumption. AA measured GLM-5.2 using ~43k output tokens per Intelligence Index task, materially more than MiniMax-M3 (24k), Kimi K2.6 (35k), or even DeepSeek V4 Pro max (37k). The model “thinks more out loud” — which contributes to its strong scores but adds to per-task cost.

Where GLM-5.2 sits on the benchmarks

AA Intelligence Index v4.1:

GLM-5.2: 51 (highest open-weight score recorded)
MiniMax-M3: 44
DeepSeek V4 Pro (max): 44
Kimi K2.6: 43
Claude Fable 5: 64.9 (closed-frontier leader)
GPT-5.5: ~60 (closed)

Code Arena WebDev: #2 overall, behind only Claude Fable 5. Notable because GLM-5.2 has no vision input, and front-end web development is typically a vision-advantaged task (screenshots, design references, visual debugging).

SVG generation: Anecdotal but cited — Simon Willison’s “pelican riding a bicycle” SVG benchmark showed strong results, consistent with GLM-5.1’s strength.

How to access GLM-5.2

Path	Best for	Pricing
OpenRouter	Quick integration into existing OpenAI-compatible stacks	~$1.40/$4.40 per M
Z.ai coding plan	Direct subscription with premium features	Subscription
Hugging Face weights	Self-host, fine-tune, sovereign deployment	Inference cost only
vLLM / SGLang	High-throughput self-hosted inference	Inference cost only

The 1.51 TB weight size means self-hosting requires multi-H100 or H200 territory. For most teams, OpenRouter is the practical entry point.

How GLM-5.2 changes the routing math

Before GLM-5.2, the highest-quality open-weight model was meaningfully behind closed-frontier on most benchmarks. Teams routing agentic coding workloads typically used closed-frontier models (Claude Opus 4.8, GPT-5.5) for the bulk of work and reserved smaller open-weight models for cost-sensitive long-tail tasks.

GLM-5.2 narrows that gap. The Intelligence Index gap to Claude Fable 5 is roughly 14 points (51 vs 64.9). For everything except the hardest 10-20% of tasks, GLM-5.2 is now competitive. At 4-7x lower cost. With self-hostable weights.

For production teams, the practical impact:

The “default” tier for agentic coding shifts from closed-frontier to GLM-5.2 for most workloads.
Claude Fable 5 (or Opus 4.8) becomes the quality-ceiling fallback for the hardest tasks.
Cost-per-task drops substantially for any team that adopts the routing pattern.
Sovereignty-required deployments now have a credible top-tier option.

What GLM-5.2 is not

Not multimodal. No vision input. For screenshot-to-code, scientific figure analysis, or visual debugging, you still need Fable 5, Opus 4.8, GPT-5.5, or Kimi K2.7 Code.
Not the smartest model in the world. Claude Fable 5 still leads the Intelligence Index at 64.9.
Not the cheapest per task. DeepSeek V4 Pro is still slightly cheaper for many workloads given GLM-5.2’s high output-token consumption.
Not new architecture territory. It is a refined MoE, not a paradigm shift.

The honest read

GLM-5.2 is the most important open-weight release of June 2026 — possibly of the year so far. The combination of #1-open-weight Intelligence Index score, 1M context window, MIT license, and 4-7x cost advantage over closed-frontier means it now belongs in every production routing stack that involves agentic coding work.

It is not a replacement for Claude Fable 5. It is the new default for 60-80% of the work that used to require Fable 5 — and that is the meaningful cost shift for any team running agentic coding at scale in late June 2026.