What is the difference between GPT-5.6 Sol, Terra, and Luna?

Three tiers, three price points, three workload targets. Sol is the flagship — strongest capability, designed for agentic coding, security research, and long-horizon reasoning; $5 input / $30 output per 1M tokens. Terra is the mainstream — performance roughly comparable to GPT-5.5 but at half the price; intended for general API and ChatGPT use; $2.50 / $15 per 1M tokens. Luna is the cost-optimized tier — designed for batch processing, high-volume background tasks, and embedded applications; $1 / $6 per 1M tokens. The pricing ratio (5:2.5:1 input, 5:2.5:1 output) is clean and intentional. Choose Sol for hard problems, Terra for everyday work, Luna for volume.

Is GPT-5.6 Terra actually as good as GPT-5.5?

Per OpenAI, yes — Terra is positioned as 'performance similar to GPT-5.5 but at half the cost.' That's a strong claim and the typical pattern with mid-tier models is that 'similar' means roughly 90-95% of flagship performance on most benchmarks with some edge cases where the gap is larger. Expect Terra to be slightly weaker than GPT-5.5 on hardest-reasoning tasks (frontier math, complex agentic coding, novel problem domains) but indistinguishable for the majority of mainstream API and ChatGPT use cases (writing, coding assistance for routine tasks, summarization, conversational agents). The 50% cost cut is the headline; if Terra is even 95% of GPT-5.5 on your workload, the migration is almost certainly worth it. Validate on your actual evaluation suite before fully migrating.

When should I use GPT-5.6 Luna instead of GPT-5.6 Terra?

Use Luna when per-task cost dominates and quality requirements are modest. Specific workload patterns: classification (sentiment, intent, topic), simple entity extraction, content moderation, batch summarization where you process millions of items, embedded LLM calls inside a larger system (e.g. agent sub-steps), and any high-volume background task where the cost difference matters at scale. Use Terra when you need GPT-5.5-class reasoning at half the cost. Use Sol when you need the absolute best performance. As a rough rule of thumb: if you're spending more than $1,000/month on a single workload, the cost difference between Luna ($1/$6) and Terra ($2.50/$15) is worth a careful quality comparison. Below that volume, Terra's better quality usually wins.

Can I mix GPT-5.6 tiers in a single workflow?

Yes — and you should. The recommended pattern in mid-2026 is model routing within a single workflow. Example architecture for a coding agent: (1) Sol for the planning and architecture step (hard reasoning, high stakes per token). (2) Terra for code generation and refactoring steps (mainstream quality, moderate volume). (3) Luna for routine sub-steps like linting suggestions, doc lookups, simple code reformatting (high volume, low complexity per call). This pattern is sometimes called 'model cascading' and can cut total workflow cost by 60-80% versus running Sol end-to-end while preserving quality at the steps that matter. Tools that make routing easier: OpenRouter, Helicone, Portkey, and OpenAI's own routing primitives in the upcoming GPT-5.6 SDK.

Quick Answer

GPT-5.6 Sol vs Terra vs Luna: Which Tier to Use (June 2026)

Published: June 27, 2026

GPT-5.6 Sol vs Terra vs Luna: Which Tier to Use (June 2026)

OpenAI’s GPT-5.6 family — previewed June 26, 2026 — ships in three tiers: Sol (flagship), Terra (mainstream), and Luna (cost-optimized). Choosing between them is the highest-leverage decision for any team about to migrate from GPT-5.5. Short answer: route by workload type. Sol for hard reasoning and agent planning, Terra for everyday work, Luna for high-volume background tasks. Most production teams should use all three with model cascading.

Last verified: June 27, 2026.

TL;DR

Sol — $5 input / $30 output per 1M tokens. Use for: agentic coding, security research, long-horizon planning, frontier reasoning.
Terra — $2.50 input / $15 output per 1M tokens. Use for: general ChatGPT/API workloads, mainstream coding, conversational agents, summarization.
Luna — $1 input / $6 output per 1M tokens. Use for: classification, batch processing, agent sub-steps, embedded LLM calls.
Best pattern: model cascading — Sol for planning, Terra for execution, Luna for routine steps.
Pricing ratio: clean 5:2.5:1 across all three tiers on both input and output.

The three tiers side-by-side

Dimension	Sol	Terra	Luna
Input price (per 1M)	$5.00	$2.50	$1.00
Output price (per 1M)	$30.00	$15.00	$6.00
Positioning	Flagship	Mainstream	Cost-optimized
Benchmark Terminal-Bench 2.1	88.8% (base) / 91.9% (Ultra)	Not yet published	Not yet published
Comparable prior model	New frontier (above GPT-5.5)	Similar to GPT-5.5	Similar to GPT-5.5 Mini or below
Best for	Hard reasoning, agent planning, security	Everyday API and ChatGPT	High-volume batch, embedded
Closest competitor	Claude Mythos 5, Claude Fable 5	Gemini 3.5 Pro, Claude Sonnet 4.x	Gemini 3.5 Flash, Claude Haiku

When to use Sol

Sol is the flagship. The right question is not “should I use Sol?” but “is this workload worth Sol’s per-token cost?”

Use Sol for:

Agentic coding with long autonomy bounds — multi-hour Claude-Code-style sessions, large refactors, complex test-driven development loops
Security research and exploit reasoning — Sol’s headline efficiency claim is ~3x output-token reduction on ExploitBench versus prior frontier
Hard reasoning — competitive math, theorem proving, complex multi-step deduction, expert-level scientific reasoning
Architecture and planning steps in larger workflows — high stakes per call, quality dominates cost
Single-shot expert tasks — code review of critical PRs, security audits, advisory-quality writing

Don’t use Sol for:

Routine code completion
Classification, extraction, simple Q&A
High-volume batch tasks
Workflows where per-task quality is already saturated at Terra-tier capability

The cost-quality math: at $30 per 1M output tokens, a single long Sol agent session can easily cost $5-$20. That’s appropriate for a senior-engineer-equivalent task, wasteful for routine work.

When to use Terra

Terra is the workhorse. For most teams, Terra will handle 70-90% of LLM calls once it’s generally available.

Use Terra for:

General ChatGPT-style workloads — chat, writing, brainstorming, summarization
Mainstream coding assistance — autocomplete, function-scale generation, code explanation, doc generation
Conversational agents — customer support, internal Q&A, knowledge-base assistants
Migrations from GPT-5.5 — Terra at half the cost is the default migration target
RAG over moderately-sized corpora — where Gemini 3.5 Pro pricing is competitive but you want OpenAI’s ecosystem

Don’t use Terra for:

Workloads where Luna’s $1/$6 pricing materially improves unit economics
Hardest reasoning tasks (use Sol)
Cases where Anthropic or Google models clearly outperform on your evaluation suite

The Terra positioning is explicit: GPT-5.5 performance at half the cost. If you’ve already validated GPT-5.5 for a workload, Terra is the obvious upgrade once available.

When to use Luna

Luna competes with Gemini 3.5 Flash ($0.35/$2.80) and Claude Haiku in the cheap-tier segment. At $1 input / $6 output per 1M tokens, Luna is more expensive than Flash but with OpenAI ecosystem benefits.

Use Luna for:

Classification at scale — sentiment, intent, topic, content moderation
Simple entity extraction — names, dates, structured data from text
Batch summarization — processing millions of items at low per-item cost
Agent sub-steps — within a larger workflow where Sol or Terra runs the hard steps
Embedded LLM calls — in apps where LLM is one component of a larger system
High-volume background tasks — anything where the daily volume is in the millions

Don’t use Luna for:

Tasks where you’d notice the quality drop versus Terra
Interactive, customer-facing flows where quality matters per call
Agent planning or architecture steps
Final-output writing where you ship the result to a user

The Luna math: at $1/$6 vs Terra’s $2.50/$15, Luna is 60% cheaper. For a workload processing 10M items/month at 500 input + 200 output tokens each, Luna costs about $11K/month vs Terra at about $28K — a $17K/month savings if quality is acceptable.

The cascading pattern (recommended)

The highest-leverage architectural pattern in mid-2026 is mixing tiers within a single workflow. Example for a coding agent:

1. Plan the change           → Sol      (high stakes, low volume)
2. Search the codebase       → Luna     (high volume, simple)
3. Generate the patch        → Terra    (mainstream coding)
4. Generate tests            → Terra    (mainstream coding)
5. Review the patch          → Sol      (high stakes, low volume)
6. Generate the commit msg   → Luna     (simple, formatted output)

If the average step uses ~5K input + ~2K output tokens, the per-task costs are roughly:

All Sol: ~$0.40 per task
All Terra: ~$0.20 per task
All Luna: ~$0.08 per task
Cascaded (2 Sol + 2 Terra + 2 Luna): ~$0.23 per task

Cascading delivers ~42% cost reduction versus all-Sol while preserving Sol-quality decisions at the two steps where it matters. Over 100K tasks per month, that’s ~$17K saved.

Routing decision tree

A simple decision tree for picking the tier per call:

Is this a "hard" task (frontier reasoning, security, agent planning)?
  YES → Sol
  NO  → Is this an interactive user-facing call?
          YES → Terra (or Sol if quality-critical)
          NO  → Is this a high-volume background task?
                  YES → Luna
                  NO  → Terra

Tools to implement cascading

OpenRouter — multi-provider routing, including all GPT-5.6 tiers once available
Helicone — observability + routing + caching
Portkey — model abstraction layer with fallbacks and routing
OpenAI native — the upcoming GPT-5.6 SDK is expected to include routing primitives
Custom wrapper — for narrow needs, a thin function-call wrapper is often enough

What to do today (June 27, 2026)

Cleared access: start mapping your workloads to tiers now. Run Sol on hard tasks; benchmark Terra vs GPT-5.5 on mainstream; benchmark Luna vs GPT-5.5 Mini and Gemini Flash on cheap-tier.
No cleared access (most teams): plan the cascading architecture against current models. Use GPT-5.5 as Sol-equivalent today; Gemini Pro or Claude Sonnet as Terra-equivalent; Gemini Flash or Claude Haiku as Luna-equivalent. Swap GPT-5.6 tiers in when GA opens.
For all teams: abstract the model layer. Pick a router (OpenRouter, Helicone, Portkey) and start routing today, so the GPT-5.6 migration is a config change, not a code refactor.

GPT-5.6 Sol vs Terra vs Luna: Which Tier to Use (June 2026)

TL;DR

The three tiers side-by-side

When to use Sol

When to use Terra

When to use Luna

The cascading pattern (recommended)

Routing decision tree

Tools to implement cascading

What to do today (June 27, 2026)

Related