GPT-5.6 Sol vs Terra vs Luna: Which Tier to Use (June 2026)

OpenAI’s GPT-5.6 family — previewed June 26, 2026 — ships in three tiers: Sol (flagship), Terra (mainstream), and Luna (cost-optimized). Choosing between them is the highest-leverage decision for any team about to migrate from GPT-5.5. Short answer: route by workload type. Sol for hard reasoning and agent planning, Terra for everyday work, Luna for high-volume background tasks. Most production teams should use all three with model cascading.

Last verified: June 27, 2026.

TL;DR

  • Sol — $5 input / $30 output per 1M tokens. Use for: agentic coding, security research, long-horizon planning, frontier reasoning.
  • Terra — $2.50 input / $15 output per 1M tokens. Use for: general ChatGPT/API workloads, mainstream coding, conversational agents, summarization.
  • Luna — $1 input / $6 output per 1M tokens. Use for: classification, batch processing, agent sub-steps, embedded LLM calls.
  • Best pattern: model cascading — Sol for planning, Terra for execution, Luna for routine steps.
  • Pricing ratio: clean 5:2.5:1 across all three tiers on both input and output.

The three tiers side-by-side

| Dimension | Sol | Terra | Luna |
|---|---|---|---|
| Input price (per 1M tokens) | $5.00 | $2.50 | $1.00 |
| Output price (per 1M tokens) | $30.00 | $15.00 | $6.00 |
| Positioning | Flagship | Mainstream | Cost-optimized |
| Benchmark (Terminal-Bench 2.1) | 88.8% (base) / 91.9% (Ultra) | Not yet published | Not yet published |
| Comparable prior model | New frontier (above GPT-5.5) | Similar to GPT-5.5 | Similar to GPT-5.5 Mini or below |
| Best for | Hard reasoning, agent planning, security | Everyday API and ChatGPT | High-volume batch, embedded |
| Closest competitor | Claude Mythos 5, Claude Fable 5 | Gemini 3.5 Pro, Claude Sonnet 4.x | Gemini 3.5 Flash, Claude Haiku |

When to use Sol

Sol is the flagship. The right question is not “should I use Sol?” but “is this workload worth Sol’s per-token cost?”

Use Sol for:

  • Agentic coding with long autonomy bounds — multi-hour Claude-Code-style sessions, large refactors, complex test-driven development loops
  • Security research and exploit reasoning — Sol’s headline efficiency claim is ~3x output-token reduction on ExploitBench versus prior frontier
  • Hard reasoning — competitive math, theorem proving, complex multi-step deduction, expert-level scientific reasoning
  • Architecture and planning steps in larger workflows — high stakes per call, quality dominates cost
  • Single-shot expert tasks — code review of critical PRs, security audits, advisory-quality writing

Don’t use Sol for:

  • Routine code completion
  • Classification, extraction, simple Q&A
  • High-volume batch tasks
  • Workflows where per-task quality is already saturated at Terra-tier capability

The cost-quality math: at $30 per 1M output tokens, a single long Sol agent session can easily cost $5-$20. That’s appropriate for a senior-engineer-equivalent task, wasteful for routine work.
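
To make that range concrete, here is a rough sketch of the session-cost arithmetic at Sol's list prices. The token counts in the example call are illustrative assumptions, not measurements from a real session.

```python
# Sol list prices from the pricing table, in USD per 1M tokens.
SOL_INPUT_PER_M = 5.00
SOL_OUTPUT_PER_M = 30.00


def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one Sol session at list price."""
    total = input_tokens * SOL_INPUT_PER_M + output_tokens * SOL_OUTPUT_PER_M
    return total / 1_000_000


# Hypothetical long agent session: 1M input tokens (context re-read
# across many turns) and 200K output tokens.
print(f"${session_cost(1_000_000, 200_000):.2f}")
```

Because output tokens cost six times as much as input tokens, trimming verbose output usually moves the bill more than trimming context does.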

When to use Terra

Terra is the workhorse. For most teams, Terra will likely handle 70-90% of LLM calls once it is generally available.

Use Terra for:

  • General ChatGPT-style workloads — chat, writing, brainstorming, summarization
  • Mainstream coding assistance — autocomplete, function-scale generation, code explanation, doc generation
  • Conversational agents — customer support, internal Q&A, knowledge-base assistants
  • Migrations from GPT-5.5 — Terra at half the cost is the default migration target
  • RAG over moderately-sized corpora — where Gemini 3.5 Pro pricing is competitive but you want OpenAI’s ecosystem

Don’t use Terra for:

  • Workloads where Luna’s $1/$6 pricing materially improves unit economics
  • Hardest reasoning tasks (use Sol)
  • Cases where Anthropic or Google models clearly outperform on your evaluation suite

The Terra positioning is explicit: GPT-5.5 performance at half the cost. If you’ve already validated GPT-5.5 for a workload, Terra is the obvious upgrade once available.

When to use Luna

Luna competes with Gemini 3.5 Flash ($0.35/$2.80) and Claude Haiku in the cheap-tier segment. At $1 input / $6 output per 1M tokens, Luna costs roughly two to three times as much as Flash, but it comes with OpenAI ecosystem benefits.

Use Luna for:

  • Classification at scale — sentiment, intent, topic, content moderation
  • Simple entity extraction — names, dates, structured data from text
  • Batch summarization — processing millions of items at low per-item cost
  • Agent sub-steps — within a larger workflow where Sol or Terra runs the hard steps
  • Embedded LLM calls: apps where the LLM is one component of a larger system
  • High-volume background tasks — anything where the daily volume is in the millions

Don’t use Luna for:

  • Tasks where you’d notice the quality drop versus Terra
  • Interactive, customer-facing flows where quality matters per call
  • Agent planning or architecture steps
  • Final-output writing where you ship the result to a user

The Luna math: at $1/$6 vs Terra’s $2.50/$15, Luna is 60% cheaper. A workload processing 10M items/month at 500 input + 200 output tokens each consumes 5B input and 2B output tokens per month. At list prices, Luna costs about $17K/month vs Terra at about $42.5K, a saving of roughly $25.5K/month if quality is acceptable.
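
The monthly comparison can be worked directly from the list prices. This is a minimal sketch that assumes flat per-token pricing with no caching or batch discounts; the function and variable names are illustrative.

```python
# List prices in USD per 1M tokens as (input, output), from the pricing table.
PRICES = {"terra": (2.50, 15.00), "luna": (1.00, 6.00)}


def monthly_cost(tier: str, items: int, input_per_item: int, output_per_item: int) -> float:
    """Return the monthly USD cost of processing `items` requests on one tier."""
    input_price, output_price = PRICES[tier]
    input_millions = items * input_per_item / 1_000_000
    output_millions = items * output_per_item / 1_000_000
    return input_millions * input_price + output_millions * output_price


# The workload from the text: 10M items/month, 500 input + 200 output tokens each.
for tier in PRICES:
    print(f"{tier}: ${monthly_cost(tier, 10_000_000, 500, 200):,.0f}/month")
```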

Model cascading

The highest-leverage architectural pattern in mid-2026 is mixing tiers within a single workflow. Example for a coding agent:

1. Plan the change           → Sol      (high stakes, low volume)
2. Search the codebase       → Luna     (high volume, simple)
3. Generate the patch        → Terra    (mainstream coding)
4. Generate tests            → Terra    (mainstream coding)
5. Review the patch          → Sol      (high stakes, low volume)
6. Generate the commit msg   → Luna     (simple, formatted output)

If the average step uses ~5K input + ~2K output tokens, a single step costs about $0.085 on Sol, $0.0425 on Terra, and $0.017 on Luna. Across all six steps, the per-task costs are roughly:

  • All Sol: ~$0.51 per task
  • All Terra: ~$0.26 per task
  • All Luna: ~$0.10 per task
  • Cascaded (2 Sol + 2 Terra + 2 Luna): ~$0.29 per task

Cascading delivers a ~43% cost reduction versus all-Sol while preserving Sol-quality decisions at the two steps where it matters. Over 100K tasks per month, that’s ~$22K saved.
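
The per-task figures can be computed directly from the list prices. This is a minimal sketch that assumes every one of the six steps uses exactly 5K input and 2K output tokens; real steps will vary, so treat the output as an estimate.

```python
# List prices in USD per 1M tokens as (input, output).
PRICES = {"sol": (5.00, 30.00), "terra": (2.50, 15.00), "luna": (1.00, 6.00)}

# Tier assignment for the six coding-agent steps listed above.
CASCADE = ["sol", "luna", "terra", "terra", "sol", "luna"]


def step_cost(tier: str, input_tokens: int = 5_000, output_tokens: int = 2_000) -> float:
    """Return the USD cost of one step on the given tier."""
    input_price, output_price = PRICES[tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def task_cost(tiers: list) -> float:
    """Return the USD cost of a whole task given one tier per step."""
    return sum(step_cost(tier) for tier in tiers)


all_sol = task_cost(["sol"] * len(CASCADE))
cascaded = task_cost(CASCADE)
print(f"all Sol:  ${all_sol:.3f} per task")
print(f"cascaded: ${cascaded:.3f} per task")
print(f"saving:   {1 - cascaded / all_sol:.0%}")
```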

Routing decision tree

A simple decision tree for picking the tier per call:

Is this a "hard" task (frontier reasoning, security, agent planning)?
  YES → Sol
  NO  → Is this an interactive user-facing call?
          YES → Terra (or Sol if quality-critical)
          NO  → Is this a high-volume background task?
                  YES → Luna
                  NO  → Terra
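
The tree above translates almost line for line into a routing function. This is a sketch: the `Call` fields are hypothetical flags that your own code would set, and the tier names are plain labels, not model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical description of one LLM call, filled in by the caller."""
    hard: bool = False                    # frontier reasoning, security, agent planning
    interactive: bool = False             # user-facing, latency- and quality-sensitive
    quality_critical: bool = False        # only consulted for interactive calls
    high_volume_background: bool = False  # batch or background work


def pick_tier(call: Call) -> str:
    """Mirror the decision tree: hard -> Sol, interactive -> Terra/Sol, batch -> Luna."""
    if call.hard:
        return "sol"
    if call.interactive:
        return "sol" if call.quality_critical else "terra"
    if call.high_volume_background:
        return "luna"
    return "terra"


print(pick_tier(Call(hard=True)))
print(pick_tier(Call(interactive=True)))
print(pick_tier(Call(high_volume_background=True)))
```

Keeping the rule in one pure function makes it easy to unit-test and to adjust once you have evaluation data for each tier.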

Tools to implement cascading

  • OpenRouter — multi-provider routing, including all GPT-5.6 tiers once available
  • Helicone — observability + routing + caching
  • Portkey — model abstraction layer with fallbacks and routing
  • OpenAI native — the upcoming GPT-5.6 SDK is expected to include routing primitives
  • Custom wrapper — for narrow needs, a thin function-call wrapper is often enough
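
For the custom-wrapper option, a thin layer can be as small as the sketch below. The model identifiers are hypothetical placeholders (no official IDs are given here), and `send` stands in for whatever client call you already use; nothing in this block is an actual SDK API.

```python
from typing import Callable, Dict

# Hypothetical model identifiers; replace with the real IDs for your provider.
TIER_TO_MODEL: Dict[str, str] = {
    "sol": "gpt-5.6-sol",
    "terra": "gpt-5.6-terra",
    "luna": "gpt-5.6-luna",
}

# Signature of the injected client call: (model_id, prompt) -> completion text.
SendFn = Callable[[str, str], str]


def make_router(
    send: SendFn,
    tier_to_model: Dict[str, str] = TIER_TO_MODEL,
    fallback_tier: str = "terra",
) -> Callable[[str, str], str]:
    """Build a complete(tier, prompt) function on top of an existing client call."""
    def complete(tier: str, prompt: str) -> str:
        # Unknown tiers fall back to the mainstream tier instead of failing.
        model = tier_to_model.get(tier, tier_to_model[fallback_tier])
        return send(model, prompt)
    return complete


def fake_send(model: str, prompt: str) -> str:
    """Stand-in for a real API client, used only to demonstrate the wrapper."""
    return f"[{model}] {prompt}"


complete = make_router(fake_send)
print(complete("luna", "Classify this ticket"))
```

Injecting `send` keeps the wrapper independent of any one provider SDK, which is the point of abstracting the model layer.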

What to do today (June 27, 2026)

  1. Cleared access: start mapping your workloads to tiers now. Run Sol on hard tasks; benchmark Terra against GPT-5.5 on mainstream workloads; benchmark Luna against GPT-5.5 Mini and Gemini Flash on cheap-tier workloads.
  2. No cleared access (most teams): plan the cascading architecture against current models. Use GPT-5.5 as Sol-equivalent today; Gemini Pro or Claude Sonnet as Terra-equivalent; Gemini Flash or Claude Haiku as Luna-equivalent. Swap GPT-5.6 tiers in when GA opens.
  3. For all teams: abstract the model layer. Pick a router (OpenRouter, Helicone, Portkey) and start routing today, so the GPT-5.6 migration is a config change, not a code refactor.
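
One way to make the migration a config change is to keep the tier-to-model mapping in data rather than in code. The sketch below assumes a JSON config; every model identifier in it is a placeholder for illustration, not an official ID.

```python
import json

REQUIRED_TIERS = {"sol", "terra", "luna"}


def load_tier_map(raw_json: str) -> dict:
    """Parse a tier-to-model mapping and check that all three tiers are present."""
    tier_map = json.loads(raw_json)
    missing = REQUIRED_TIERS - set(tier_map)
    if missing:
        raise ValueError(f"tier map is missing: {sorted(missing)}")
    return tier_map


# Placeholder configs: stand-in models today, GPT-5.6 tiers after GA.
TODAY = '{"sol": "gpt-5.5", "terra": "claude-sonnet", "luna": "gemini-flash"}'
AFTER_GA = '{"sol": "gpt-5.6-sol", "terra": "gpt-5.6-terra", "luna": "gpt-5.6-luna"}'

tiers = load_tier_map(TODAY)
print(tiers["sol"])

# Migrating later means loading a different config; calling code is unchanged.
tiers = load_tier_map(AFTER_GA)
print(tiers["sol"])
```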