
What Is SWE-1.5? Cognition's Fast Agent Model (2026)

SWE-1.5 is Cognition AI’s frontier coding model — and at 950 tokens per second on Cerebras, it’s the fastest serious AI coding model on the market in 2026. Here’s what it is, how it works, and where it fits in the Claude Opus 4.7 / GPT-5.4 Codex / Composer 2 landscape.

Last verified: April 2026

SWE-1.5 at a Glance

| Detail | Value |
|---|---|
| Maker | Cognition AI |
| Released | October 29, 2025 |
| Architecture | Frontier dense transformer, hundreds of billions of parameters |
| Inference hardware | Cerebras CS-3 wafer-scale |
| Speed | ~950 tokens/second |
| SWE-bench Verified | ~82% (as of April 2026) |
| Availability | Windsurf, Devin (no public API) |
| Use case | Agentic coding, autonomous SWE tasks |

Who Made It?

SWE-1.5 comes from Cognition AI, the company behind Devin — the autonomous software engineer agent that went viral in early 2024. Cognition acquired Windsurf in 2025, bringing together a frontier lab (Cognition) and a mature AI IDE (Windsurf) under one roof.

Cognition partnered with Cerebras Systems to run inference on Cerebras’s wafer-scale CS-3 hardware, which delivers single-stream transformer inference orders of magnitude faster than GPU clusters.

What Makes SWE-1.5 Different

1. Frontier Scale, Coding-Tuned

SWE-1.5 is a full frontier-scale model — hundreds of billions of parameters — specifically post-trained on agentic coding trajectories. Unlike general-purpose models that handle coding as one of many tasks, SWE-1.5 was optimized end-to-end for:

  • Multi-file edits
  • Tool-use reliability (no hallucinated functions)
  • Long agentic loops (50+ steps)
  • Plan-then-execute patterns
  • Test-driven iteration
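The loop patterns above can be sketched in miniature. Every name here (`plan`, `execute`, `run_agent`) is a hypothetical stand-in of my own, not Cognition’s interface; SWE-1.5 has no public API, so this only illustrates the plan-then-execute control flow such models are post-trained on:

```python
# Minimal plan-then-execute agent loop. All names are hypothetical stubs,
# not Cognition's actual interface; this only shows the control-flow shape.

def plan(task):
    # A real agent would ask the model for a plan; we hardcode one.
    return ["read_file", "edit_file", "run_tests"]

def execute(step, trace):
    # A real agent would dispatch to tools (file I/O, shell, test runner)
    # and feed the results back to the model.
    trace.append(step)
    return "ok"

def run_agent(task, max_steps=50):
    trace = []
    for step in plan(task):
        if len(trace) >= max_steps:       # cap long agentic loops
            return "budget_exhausted", trace
        if execute(step, trace) != "ok":  # test-driven iteration would retry here
            return "failed", trace
    return "done", trace

status, trace = run_agent("fix the failing test")
```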

2. 950 Tokens/Second

The headline feature. Running on Cerebras, SWE-1.5 produces output at more than 20× the speed of Claude Opus 4.7 and nearly 5× the speed of Cursor’s Composer 2 (per the benchmark figures below). In practice this means:

  • Code appears faster than you can read it
  • Iteration loops that took minutes complete in seconds
  • Agentic workflows with 30+ tool calls finish in under a minute
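The arithmetic behind those claims is simple. A back-of-envelope sketch using the throughput figures from the benchmark table below, counting decode time only (real-world latency adds network and prompt-prefill overhead):

```python
# Generation time for a single 2,000-token response at the throughputs
# quoted in this article (pure decode time; ignores network and prefill).
speeds_tps = {"SWE-1.5": 950, "Composer 2": 200, "GPT-5.4 Codex": 90, "Opus 4.7": 40}
response_tokens = 2000

decode_seconds = {model: response_tokens / tps for model, tps in speeds_tps.items()}
# SWE-1.5 finishes in ~2 s; Opus 4.7 takes 50 s for the same response.
```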

3. Agentic-First Training

Cognition trained SWE-1.5 using data from Devin’s production traffic — millions of real agent rollouts with real tool-use traces. That’s a fundamentally different training signal than models trained primarily on GitHub code.

Benchmarks (April 2026)

| Benchmark | SWE-1.5 | Opus 4.7 | GPT-5.4 Codex | Composer 2 |
|---|---|---|---|---|
| SWE-bench Verified | ~82% | 87.6% | 79.2% | ~76% |
| Tokens/sec | ~950 | ~40 | ~90 | ~200 |
| Tool-use reliability | High | Highest | Medium | High |
| Context window | 200K | 1M | 1M | 300K |

SWE-1.5 isn’t the most accurate model (Opus 4.7 is), but it’s the fastest accurate-enough model. For short tasks that any frontier model completes correctly, speed is the deciding factor.

How to Use SWE-1.5

In Windsurf

  1. Open Windsurf (Cognition’s AI IDE, free tier available)
  2. In the model picker, select SWE-1.5 (Cerebras)
  3. Use Cascade mode for agentic workflows — SWE-1.5 shines here
  4. Free tier includes parallel agents and SWE-1.5 access

In Devin

Devin uses SWE-1.5 as its default inference model for short-horizon tasks. It automatically switches to Claude Opus 4.7 for longer or more complex work.
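A routing heuristic in the spirit of that behavior can be sketched as follows. The thresholds and model names are my assumptions for illustration, not Cognition’s actual routing logic:

```python
# Hypothetical model-routing heuristic mirroring the behavior described above:
# short-horizon tasks go to the fast model, longer or context-heavy work to
# Opus 4.7. Threshold and identifiers are assumptions, not Cognition's code.
def pick_model(estimated_minutes, needs_deep_context=False):
    if needs_deep_context or estimated_minutes > 30:
        return "claude-opus-4.7"
    return "swe-1.5"
```

The 30-minute cutoff echoes the “long-horizon (>30 min) agent runs” row in the decision table below.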

As an API

Not available. Cognition has not released SWE-1.5 via public API. Access is limited to Windsurf and Devin as of April 2026. The Cerebras-dependent deployment makes general API hosting logistically difficult.

When to Pick SWE-1.5 over Opus 4.7

| Use case | Pick |
|---|---|
| Autocomplete / inline edits | SWE-1.5 (speed matters) |
| Short multi-file refactors | SWE-1.5 |
| Quick bug fixes | SWE-1.5 |
| Complex architectural refactors | Opus 4.7 (accuracy matters) |
| Long-horizon (>30 min) agent runs | Opus 4.7 |
| Hard debugging with deep context | Opus 4.7 |
| Rapid iteration / ADHD-friendly coding | SWE-1.5 |

Limitations

  • Not API-accessible — You’re locked into Windsurf or Devin
  • 200K context window — Smaller than Opus 4.7’s 1M tokens
  • Vendor lock-in to Cerebras — If Cerebras capacity drops, speed drops
  • Less general reasoning — Specialized for coding, weaker on math/general chat
  • No multimodal — Text-only, unlike Opus 4.7 or Gemini 3.1 Pro

The Broader Picture

SWE-1.5 represents a bet: that for coding agents, inference speed matters as much as model intelligence. With Cerebras hardware, an agent can run a 30-step tool-use loop in the time Opus 4.7 takes to draft a single response. For workflows like “fix every failing test in the repo” or “apply this refactor across 40 files,” that compound speed advantage is decisive.
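To put numbers on that compounding effect: assuming ~500 generated tokens per step (an illustrative figure, not from Cognition) and ignoring tool-execution time, a 30-step loop works out as:

```python
# Compound speed advantage over a 30-step agent loop. The 500 tokens/step
# is an assumed, illustrative figure; throughputs are from this article.
steps = 30
tokens_per_step = 500  # assumption for illustration

loop_seconds = {
    model: steps * tokens_per_step / tps
    for model, tps in {"SWE-1.5": 950, "Opus 4.7": 40}.items()
}
# The same loop: under 16 seconds on SWE-1.5, over 6 minutes on Opus 4.7.
```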

Expect this pattern to spread. Cursor 3’s Composer 2 at 200 tok/s on standard GPUs is already a step in the same direction. The era of choosing between “slow and smart” and “fast and dumb” is ending: SWE-1.5 is the first model that is both frontier-scale and real-time.

Verdict

SWE-1.5 is the fastest frontier coding model available in April 2026, powering Windsurf and Devin via Cerebras hardware. It’s not the most accurate (Claude Opus 4.7 still wins SWE-bench), but for short-to-medium tasks where speed compounds across many iterations, SWE-1.5 delivers the best productivity per minute.

If you haven’t tried it, open Windsurf’s free tier, select SWE-1.5 in the model picker, and ask it to refactor a file. The responsiveness genuinely changes how you code — tasks that felt like “let me queue this and walk away” become “let me just watch this finish.”