
What Is SWE-1.5? Cognition's Fast Agent Model (2026)

SWE-1.5 is Cognition AI’s frontier coding model — and at 950 tokens per second on Cerebras, it’s the fastest serious AI coding model on the market in 2026. Here’s what it is, how it works, and where it fits in the Claude Opus 4.7 / GPT-5.4 Codex / Composer 2 landscape.

Last verified: April 2026

SWE-1.5 at a Glance

| Detail | Value |
|---|---|
| Maker | Cognition AI |
| Released | October 29, 2025 |
| Architecture | Frontier dense transformer, hundreds of billions of parameters |
| Inference hardware | Cerebras CS-3 wafer-scale |
| Speed | ~950 tokens/second |
| SWE-bench Verified | ~82% (as of April 2026) |
| Availability | Windsurf, Devin (no public API) |
| Use case | Agentic coding, autonomous SWE tasks |

Who Made It?

SWE-1.5 comes from Cognition AI, the company behind Devin — the autonomous software engineer agent that went viral in early 2024. Cognition acquired Windsurf in 2025, bringing together a frontier lab (Cognition) and a mature AI IDE (Windsurf) under one roof.

Cognition partnered with Cerebras Systems to run inference on Cerebras’s wafer-scale CS-3 hardware, which delivers single-stream transformer inference orders of magnitude faster than GPU clusters.

What Makes SWE-1.5 Different

1. Frontier Scale, Coding-Tuned

SWE-1.5 is a full frontier-scale model — hundreds of billions of parameters — specifically post-trained on agentic coding trajectories. Unlike general-purpose models that handle coding as one of many tasks, SWE-1.5 was optimized end-to-end for:

  • Multi-file edits
  • Tool-use reliability (no hallucinated functions)
  • Long agentic loops (50+ steps)
  • Plan-then-execute patterns
  • Test-driven iteration
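The loop patterns above can be sketched in miniature. Every name here (`plan`, `execute`, `run_agent`) is a hypothetical stand-in of my own, not Cognition’s interface; SWE-1.5 has no public API, so this only illustrates the plan-then-execute control flow such models are post-trained on:

```python
# Minimal plan-then-execute agent loop. All names are hypothetical stubs,
# not Cognition's actual interface; this only shows the control-flow shape.

def plan(task):
    # A real agent would ask the model for a plan; we hardcode one.
    return ["read_file", "edit_file", "run_tests"]

def execute(step, trace):
    # A real agent would dispatch to tools (file I/O, shell, test runner)
    # and feed the results back to the model.
    trace.append(step)
    return "ok"

def run_agent(task, max_steps=50):
    trace = []
    for step in plan(task):
        if len(trace) >= max_steps:       # cap long agentic loops
            return "budget_exhausted", trace
        if execute(step, trace) != "ok":  # test-driven iteration would retry here
            return "failed", trace
    return "done", trace

status, trace = run_agent("fix the failing test")
```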

2. 950 Tokens/Second

The headline feature. Running on Cerebras, SWE-1.5 produces output at more than 20× the speed of Claude Opus 4.7 and nearly 5× the speed of Cursor’s Composer 2 (per the benchmark figures below). In practice this means:

  • Code appears faster than you can read it
  • Iteration loops that took minutes complete in seconds
  • Agentic workflows with 30+ tool calls finish in under a minute
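The arithmetic behind those claims is simple. A back-of-envelope sketch using the throughput figures from the benchmark table below, counting decode time only (real-world latency adds network and prompt-prefill overhead):

```python
# Generation time for a single 2,000-token response at the throughputs
# quoted in this article (pure decode time; ignores network and prefill).
speeds_tps = {"SWE-1.5": 950, "Composer 2": 200, "GPT-5.4 Codex": 90, "Opus 4.7": 40}
response_tokens = 2000

decode_seconds = {model: response_tokens / tps for model, tps in speeds_tps.items()}
# SWE-1.5 finishes in ~2 s; Opus 4.7 takes 50 s for the same response.
```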

3. Agentic-First Training

Cognition trained SWE-1.5 using data from Devin’s production traffic — millions of real agent rollouts with real tool-use traces. That’s a fundamentally different training signal than models trained primarily on GitHub code.

Benchmarks (April 2026)

| Benchmark | SWE-1.5 | Opus 4.7 | GPT-5.4 Codex | Composer 2 |
|---|---|---|---|---|
| SWE-bench Verified | ~82% | 87.6% | 79.2% | ~76% |
| Tokens/sec | ~950 | ~40 | ~90 | ~200 |
| Tool-use reliability | High | Highest | Medium | High |
| Context window | 200K | 1M | 1M | 300K |

SWE-1.5 isn’t the most accurate model (Opus 4.7 is), but it’s the fastest accurate-enough model. For short tasks that any frontier model completes correctly, speed is the deciding factor.

How to Use SWE-1.5

In Windsurf

  1. Open Windsurf (Cognition’s AI IDE, free tier available)
  2. In the model picker, select SWE-1.5 (Cerebras)
  3. Use Cascade mode for agentic workflows — SWE-1.5 shines here
  4. Free tier includes parallel agents and SWE-1.5 access

In Devin

Devin uses SWE-1.5 as its default inference model for short-horizon tasks. It automatically switches to Claude Opus 4.7 for longer or more complex work.
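A routing heuristic in the spirit of that behavior can be sketched as follows. The thresholds and model names are my assumptions for illustration, not Cognition’s actual routing logic:

```python
# Hypothetical model-routing heuristic mirroring the behavior described above:
# short-horizon tasks go to the fast model, longer or context-heavy work to
# Opus 4.7. Threshold and identifiers are assumptions, not Cognition's code.
def pick_model(estimated_minutes, needs_deep_context=False):
    if needs_deep_context or estimated_minutes > 30:
        return "claude-opus-4.7"
    return "swe-1.5"
```

The 30-minute cutoff echoes the “long-horizon (>30 min) agent runs” row in the decision table below.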

As an API

Not available. Cognition has not released SWE-1.5 via public API. Access is limited to Windsurf and Devin as of April 2026. The Cerebras-dependent deployment makes general API hosting logistically difficult.

When to Pick SWE-1.5 over Opus 4.7

| Use case | Pick |
|---|---|
| Autocomplete / inline edits | SWE-1.5 (speed matters) |
| Short multi-file refactors | SWE-1.5 |
| Quick bug fixes | SWE-1.5 |
| Complex architectural refactors | Opus 4.7 (accuracy matters) |
| Long-horizon (>30 min) agent runs | Opus 4.7 |
| Hard debugging with deep context | Opus 4.7 |
| Rapid iteration / ADHD-friendly coding | SWE-1.5 |

Limitations

  • Not API-accessible — You’re locked into Windsurf or Devin
  • 200K context window — Smaller than Opus 4.7’s 1M tokens
  • Vendor lock-in to Cerebras — If Cerebras capacity drops, speed drops
  • Less general reasoning — Specialized for coding, weaker on math/general chat
  • No multimodal — Text-only, unlike Opus 4.7 or Gemini 3.1 Pro

The Broader Picture

SWE-1.5 represents a bet: that for coding agents, inference speed matters as much as model intelligence. With Cerebras hardware, an agent can run a 30-step tool-use loop in the time Opus 4.7 takes to draft a single response. For workflows like “fix every failing test in the repo” or “apply this refactor across 40 files,” that compound speed advantage is decisive.
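To put numbers on that compounding effect: assuming ~500 generated tokens per step (an illustrative figure, not from Cognition) and ignoring tool-execution time, a 30-step loop works out as:

```python
# Compound speed advantage over a 30-step agent loop. The 500 tokens/step
# is an assumed, illustrative figure; throughputs are from this article.
steps = 30
tokens_per_step = 500  # assumption for illustration

loop_seconds = {
    model: steps * tokens_per_step / tps
    for model, tps in {"SWE-1.5": 950, "Opus 4.7": 40}.items()
}
# The same loop: under 16 seconds on SWE-1.5, over 6 minutes on Opus 4.7.
```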

Expect this pattern to spread. Cursor 3’s Composer 2 at 200 tok/s on standard GPUs is already a step in the same direction. The era of choosing between “slow and smart” and “fast and dumb” is ending: SWE-1.5 is the first model that is both frontier-scale and real-time.

Verdict

SWE-1.5 is the fastest frontier coding model available in April 2026, powering Windsurf and Devin via Cerebras hardware. It’s not the most accurate (Claude Opus 4.7 still wins SWE-bench), but for short-to-medium tasks where speed compounds across many iterations, SWE-1.5 delivers the best productivity per minute.

If you haven’t tried it, open Windsurf’s free tier, select SWE-1.5 in the model picker, and ask it to refactor a file. The responsiveness genuinely changes how you code — tasks that felt like “let me queue this and walk away” become “let me just watch this finish.”