Poolside Laguna XS.2 vs Qwen 3.6 vs DeepSeek V4 (May 2026)
On April 28, 2026, US AI startup Poolside released Laguna XS.2 — a 33B-A3B Mixture of Experts model under the Apache 2.0 license, designed for local agentic coding. It’s the most credible Western open-weight coding model in months, and it lands in a market dominated by Chinese open-weight models (Qwen 3.6, DeepSeek V4 Pro, GLM 5.1, Kimi K2.6). Here’s how the three compare for local and self-hosted coding workloads.
Last verified: May 7, 2026
The three at a glance
| Capability | Laguna XS.2 | Qwen 3.6 | DeepSeek V4 Pro |
|---|---|---|---|
| Vendor | Poolside (US) | Alibaba (China) | DeepSeek (China) |
| Released | April 28, 2026 | February 2026 (3.6 series) | March 2026 (V4 Pro) |
| License | Apache 2.0 | Apache 2.0 | DeepSeek License (permissive, field-of-use restrictions) |
| Architecture | 33B-A3B MoE | Dense + MoE variants | 685B-A37B MoE |
| Sizes available | 33B-A3B | 7B, 72B, 235B-A22B | 685B-A37B |
| Context | 256K | 256K | 128K (1M extended) |
| Quants | NVFP4, INT4 | GGUF, AWQ, INT4 | GGUF, AWQ, INT4 |
| Best raw coding | Mid-tier | High (72B+) | Frontier among open |
| Local machine fit | Excellent (single H100 / RTX 5090) | Mixed (7B easy, 72B hard) | Hard (needs 2-4× H100) |
| Best for | Local single-GPU dev | Production self-hosted | Highest-quality open coding |
What Laguna XS.2 is structurally
Poolside designed Laguna XS.2 specifically for local agentic coding:
- 33B total parameters, 3B activated — extremely cheap inference per token.
- MoE routing that the company tuned for “long-horizon agentic” tasks (multi-step coding work).
- Apache 2.0 licensing — no field-of-use restrictions.
- NVFP4 and INT4 quants available on day one, so it runs on an RTX 5090, a single H100, or NVIDIA DGX Spark / GB10 mini-systems.
- Hugging Face and direct distribution at huggingface.co/poolside.
The 33B-A3B design activates only ~3B parameters per token, so it is fast enough for tight agentic iteration loops on consumer hardware while retaining enough total capacity for broad knowledge of code.
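The arithmetic behind that claim can be sketched roughly. The 33B/3B parameter counts come from the article; the 0.5 bytes/param (4-bit) and ~1 TB/s memory-bandwidth figures are illustrative assumptions, not Poolside or NVIDIA numbers:

```python
# Back-of-envelope sketch: why a 33B-A3B MoE suits tight local loops.
# Assumptions (illustrative, not vendor figures): 4-bit weights
# (~0.5 bytes/param under NVFP4/INT4), ~1 TB/s effective bandwidth
# on an RTX 5090-class card.

def weight_gb(params_b: float, bytes_per_param: float = 0.5) -> float:
    """Approximate weight footprint in GB for a parameter count in billions."""
    return params_b * 1e9 * bytes_per_param / 1e9

total_gb = weight_gb(33)   # all experts must sit in VRAM
active_gb = weight_gb(3)   # but each token only reads the routed ~3B params

# Rough decode-speed ceiling: tokens/sec ~= bandwidth / bytes read per token.
bandwidth_gb_s = 1000
decode_ceiling = bandwidth_gb_s / active_gb

print(f"weights in VRAM: ~{total_gb:.1f} GB")
print(f"read per token:  ~{active_gb:.1f} GB")
print(f"decode ceiling:  ~{decode_ceiling:.0f} tok/s (ignores KV cache, overhead)")
```

The point of the sketch: VRAM scales with total parameters (so all 33B must fit), while decode speed scales with the ~3B activated per token, which is where the MoE speedup comes from.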
Qwen 3.6 in May 2026
Alibaba’s Qwen 3.6 family is the most flexible open-weight series:
- Sizes: 7B, 72B, 235B-A22B.
- Strong on coding — the Qwen 3.6 Coder variants stay competitive with DeepSeek V4 Pro despite their much smaller size.
- Apache 2.0 licensed.
- Wide ecosystem support — vLLM, SGLang, Ollama, llama.cpp all have first-class support.
- Multilingual — strong on Chinese and English code documentation.
Qwen 3.6 is the safe default for production self-hosted serving in May 2026.
DeepSeek V4 Pro in May 2026
DeepSeek V4 Pro remains the open-weight coding frontier:
- 685B total / 37B activated MoE — needs serious infrastructure.
- Best open-weight SWE-bench Verified score in May 2026.
- 128K context with 1M extended via YaRN.
- Permissive license with field-of-use restrictions (no military, no surveillance).
- Cheap inference at scale thanks to MoE sparsity.
DeepSeek V4 Pro is what you run when you have a real GPU cluster and want best-quality open coding.
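The 128K-to-1M extension mentioned above works by stretching RoPE position frequencies. A minimal sketch of the arithmetic and a YaRN-style configuration follows; the 128K/1M figures come from the article, while the `rope_scaling` keys mirror the convention used by common Hugging Face-style configs (an assumption here, not DeepSeek's published config):

```python
# Hedged sketch: YaRN-style context extension as a RoPE scaling factor.
# Figures from the article: 128K native context extended to 1M.

base_context = 128_000
extended_context = 1_000_000

# YaRN stretches RoPE frequencies by roughly the context-length ratio.
scaling_factor = extended_context / base_context

# Assumed config shape, mirroring HF-style `rope_scaling` entries.
rope_scaling = {
    "rope_type": "yarn",
    "factor": scaling_factor,
    "original_max_position_embeddings": base_context,
}
print(rope_scaling)
```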
Where each one wins
Laguna XS.2 wins for…
- Solo developers running coding agents on RTX 4090 / RTX 5090 / Apple M3 Max / M4 Max.
- Air-gapped or offline development environments.
- US government / defense contractors needing US-vendor + Apache 2.0.
- Quick local agent prototyping without API costs.
- Edge / appliance deployments (NVIDIA DGX Spark, Jetson Orin AGX).
Qwen 3.6 wins for…
- Production self-hosted inference at scale with vLLM or SGLang.
- Customers wanting model size flexibility (7B for completion, 72B for agents).
- Multilingual codebases (English + Chinese / Japanese / Korean docs).
- Existing Alibaba Cloud or Tongyi customers.
- Best balance of quality, speed, and ecosystem maturity.
DeepSeek V4 Pro wins for…
- Highest-quality open-weight coding benchmarks.
- Customers with 4-8× H100 / MI325X clusters available.
- Long-horizon agentic tasks where deeper reasoning matters.
- Multi-tenant inference where MoE sparsity reduces per-request cost.
- Workloads that need 1M-token extended context.
Hardware reality
| Model | Min single-GPU | Comfortable | Best at scale |
|---|---|---|---|
| Laguna XS.2 (NVFP4) | RTX 4090 (24GB) | RTX 5090 (32GB) / H100 (80GB) | 1-2× H100 |
| Qwen 3.6 7B | RTX 4070 (12GB) | RTX 4090 (24GB) | Single H100 |
| Qwen 3.6 72B | 2× RTX 4090 (AWQ INT4) | 2× H100 | 4× H100 |
| Qwen 3.6 235B-A22B | Not feasible | 2-4× H100 | 8× H100 / MI325X |
| DeepSeek V4 Pro | Not feasible | 4× H100 (INT4) | 8× H100 / MI325X |
Laguna XS.2’s appeal is real here — it’s the only model in this list that runs comfortably on a single consumer card.
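The table's "feasible" calls can be sanity-checked with the same 4-bit weight arithmetic. Parameter counts are the article's; the 0.5 bytes/param figure is an illustrative assumption that ignores KV cache, activations, and quantization overhead:

```python
# Sketch: approximate 4-bit weight footprints for the models in the table,
# versus a 24GB consumer card. Assumption: ~0.5 bytes/param at 4-bit,
# ignoring KV cache and runtime overhead.

CONSUMER_VRAM_GB = 24  # RTX 4090-class

models_b = {  # total parameters, in billions (from the article)
    "Laguna XS.2": 33,
    "Qwen 3.6 72B": 72,
    "Qwen 3.6 235B-A22B": 235,
    "DeepSeek V4 Pro": 685,
}

for name, params_b in models_b.items():
    gb = params_b * 0.5  # weight footprint at 0.5 bytes/param
    fits = gb <= CONSUMER_VRAM_GB
    print(f"{name:<20} ~{gb:>6.1f} GB weights  fits 24GB card: {fits}")
```

Only the 33B model's weights (~16.5 GB) leave headroom on a 24GB card, which is the whole basis of Laguna XS.2's single-GPU pitch.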
Coding capability ranking (May 2026 open-weight, rough)
Per llm-stats.com, the LMSYS Chatbot Arena, and several independent SWE-bench runs in May 2026:
- DeepSeek V4 Pro — SWE-bench Verified ~76% (frontier among open weights).
- Qwen 3.6 235B-A22B — ~73%.
- Kimi K2.6 — ~72%.
- GLM 5.1 — ~70%.
- Llama 5 405B — ~68%.
- Qwen 3.6 72B — ~66%.
- Laguna XS.2 33B-A3B — Poolside reports mid-tier-competitive results; independent SWE-bench numbers are TBD.
- DeepSeek V4 Flash, Minimax M2.7 — mid-60s.
Frontier closed models (Opus 4.6/4.7, GPT-5.5) sit at ~80-85%. Laguna XS.2 isn’t trying to beat those — it’s trying to be the best small model you can run locally.
Apache 2.0 vs DeepSeek License — does it matter?
For most developers, no. For enterprise procurement in regulated sectors:
- Apache 2.0 (Laguna XS.2, Qwen 3.6) — no field-of-use restrictions, full commercial use, modification, redistribution.
- DeepSeek License — permissive, but it explicitly prohibits military, surveillance, and certain other use cases. Chinese origin can also be a separate procurement issue for US government / defense contracts.
- Llama Community License (Llama 5) — permissive, but with restrictions for very large customers (>700M MAU) and an acceptable use policy.
For US public-sector or defense workloads, Apache 2.0 + US-vendor (Laguna XS.2) is the cleanest profile. That’s the procurement story Poolside is selling.
Bottom line
In May 2026, Laguna XS.2 is the best new option for local single-GPU agentic coding — it’s the only model in this comparison that runs comfortably on consumer hardware while being Apache 2.0 and backed by a US vendor. Qwen 3.6 is the right default for production self-hosted serving thanks to its size flexibility and ecosystem maturity. DeepSeek V4 Pro remains the frontier for open-weight coding quality when you have the GPU budget. The smart pattern: run Laguna XS.2 on developer laptops for sensitive local work, Qwen 3.6 for shared self-hosted inference, and reserve DeepSeek V4 Pro or frontier closed models for the hardest agentic tasks.
Sources: Poolside official announcement (April 28, 2026), Poolside models page (May 2026), Hugging Face poolside org (May 2026), AI CERTs News coverage (May 4, 2026), Softtechhub coverage (May 5, 2026), AI Chronicle (May 2026), NVIDIA Developer Forums (May 4, 2026), llm-stats.com benchmarks (May 2026).