Poolside Laguna XS.2 vs Qwen 3.6 vs DeepSeek V4 (May 2026)

On April 28, 2026, US AI startup Poolside released Laguna XS.2 — a 33B-A3B Mixture of Experts model under the Apache 2.0 license, designed for local agentic coding. It’s the most credible Western open-weight coding model in months, and it lands in a market dominated by Chinese open-weight models (Qwen 3.6, DeepSeek V4 Pro, GLM 5.1, Kimi K2.6). Here’s how the three compare for local and self-hosted coding workloads.

Last verified: May 7, 2026

The three at a glance

| Capability | Laguna XS.2 | Qwen 3.6 | DeepSeek V4 Pro |
| --- | --- | --- | --- |
| Vendor | Poolside (US) | Alibaba (China) | DeepSeek (China) |
| Released | April 28, 2026 | February 2026 (3.6 series) | March 2026 (V4 Pro) |
| License | Apache 2.0 | Apache 2.0 | DeepSeek License (permissive, field-of-use restrictions) |
| Architecture | 33B-A3B MoE | Dense + MoE variants | 685B-A37B MoE |
| Sizes available | 33B-A3B | 7B, 72B, 235B-A22B | 685B-A37B |
| Context | 256K | 256K | 128K (1M extended) |
| Quants | NVFP4, INT4 | GGUF, AWQ, INT4 | GGUF, AWQ, INT4 |
| Best raw coding | Mid-tier | High (72B+) | Frontier among open |
| Local machine fit | Excellent (single H100 / RTX 5090) | Mixed (7B easy, 72B hard) | Hard (needs 2-4× H100) |
| Best for | Local single-GPU dev | Production self-hosted | Highest-quality open coding |

What Laguna XS.2 is structurally

Poolside designed Laguna XS.2 specifically for local agentic coding:

  • 33B total parameters, 3B activated — extremely cheap inference per token.
  • MoE routing that the company tuned for “long-horizon agentic” tasks (multi-step coding work).
  • Apache 2.0 licensing — no field-of-use restrictions.
  • NVFP4 and INT4 quants on day one — runs on an RTX 5090, a single H100, or NVIDIA DGX Spark / GB10 mini-systems.
  • Hugging Face and direct distribution at huggingface.co/poolside.

The 33B-A3B size means it activates only ~3B parameters per token, so it’s fast enough for tight agentic iteration loops on consumer hardware while retaining enough total capacity to cover a wide range of languages and codebases.
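The economics of that 33B-A3B split are easy to quantify with back-of-the-envelope arithmetic. A rough sketch (the 0.5 bytes/parameter figure assumes 4-bit weights and ignores KV cache and runtime overhead):

```python
# Rough per-token compute and memory arithmetic for a 33B-A3B MoE model.
# Illustrative numbers: 4-bit quantized weights ~= 0.5 bytes per parameter.

TOTAL_PARAMS_B = 33    # total parameters, in billions
ACTIVE_PARAMS_B = 3    # parameters activated per token, in billions
BYTES_PER_PARAM_INT4 = 0.5

# Per-token decode FLOPs scale with *active* params, so the model
# generates tokens roughly like a 3B dense model.
dense_equivalent_ratio = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active fraction per token: {dense_equivalent_ratio:.1%}")

# VRAM still has to hold *all* experts, so weight memory scales with
# total params -- but at INT4 it stays under a 24GB consumer card.
weights_gb = TOTAL_PARAMS_B * 1e9 * BYTES_PER_PARAM_INT4 / 1e9
print(f"INT4 weight footprint: ~{weights_gb:.1f} GB")
```

The two numbers explain the pitch: ~9% of the parameters do the work on each token, while the full ~16.5 GB of INT4 weights fits on an RTX 4090-class card.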

Qwen 3.6 in May 2026

Alibaba’s Qwen 3.6 family is the most flexible open-weight series:

  • Sizes: 7B, 72B, 235B-A22B.
  • Strong on coding — Qwen 3.6 Coder variants are competitive with DeepSeek V4 Pro at smaller sizes.
  • Apache 2.0 licensed.
  • Wide ecosystem support — vLLM, SGLang, Ollama, llama.cpp all have first-class support.
  • Multilingual — strong on Chinese and English code documentation.

Qwen 3.6 is the safe default for production self-hosted serving in May 2026.
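Served behind vLLM or SGLang, a Qwen 3.6 deployment exposes an OpenAI-compatible API, so coding agents can point any OpenAI-style client at a local endpoint. A minimal sketch using only the standard library; the base URL, port, and model tag are assumptions about a typical local deployment, not official values:

```python
# Build a chat-completion request for a self-hosted OpenAI-compatible endpoint.
# BASE_URL and MODEL are deployment-specific assumptions; adjust to yours.
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"   # common default serving port
MODEL = "Qwen/Qwen3.6-72B"              # hypothetical model tag

def build_request(prompt: str) -> urllib.request.Request:
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "temperature": 0.2,  # keep sampling low for code edits
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Write a unit test for a binary search function.")
print(req.full_url)
```

Because the API surface is the same across vLLM, SGLang, and hosted providers, swapping the 7B completion model for the 72B agent model is a one-line change to the model tag.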

DeepSeek V4 Pro in May 2026

DeepSeek V4 Pro remains the open-weight coding frontier:

  • 685B total / 37B activated MoE — needs serious infrastructure.
  • Best open-weight SWE-bench Verified score in May 2026.
  • 128K context with 1M extended via YaRN.
  • Permissive license with field-of-use restrictions (no military, no surveillance).
  • Cheap inference at scale thanks to MoE sparsity.

DeepSeek V4 Pro is what you run when you have a real GPU cluster and want best-quality open coding.
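The 128K-to-1M extension via YaRN implies an 8x scaling of rotary positions, and the KV cache at that length is a cost worth estimating before you plan hardware. A rough sketch (the layer, head, and precision figures are illustrative assumptions, not DeepSeek's published config):

```python
# YaRN extends context by rescaling rotary position embeddings.
NATIVE_CONTEXT = 128 * 1024
EXTENDED_CONTEXT = 1024 * 1024
yarn_factor = EXTENDED_CONTEXT / NATIVE_CONTEXT
print(f"YaRN scaling factor: {yarn_factor:.0f}x")

# KV cache grows linearly with context length. Illustrative dimensions
# (NOT DeepSeek's real architecture): 60 layers, 8 KV heads, head dim 128,
# FP8 cache at 1 byte per value; factor of 2 covers K and V.
LAYERS, KV_HEADS, HEAD_DIM, BYTES_FP8 = 60, 8, 128, 1
kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_FP8
kv_gb_at_1m = kv_per_token * EXTENDED_CONTEXT / 1e9
print(f"KV cache at 1M tokens: ~{kv_gb_at_1m:.0f} GB per sequence")
```

Even with these modest assumed dimensions, a single 1M-token sequence costs on the order of 100+ GB of cache, which is why extended context stays a multi-GPU, cluster-scale feature.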

Where each one wins

Laguna XS.2 wins for…

  • Solo developers running coding agents on RTX 4090 / RTX 5090 / Apple M3 Max / M4 Max.
  • Air-gapped or offline development environments.
  • US government / defense contractors needing US-vendor + Apache 2.0.
  • Quick local agent prototyping without API costs.
  • Edge / appliance deployments (NVIDIA DGX Spark, Jetson Orin AGX).

Qwen 3.6 wins for…

  • Production self-hosted inference at scale with vLLM or SGLang.
  • Customers wanting model size flexibility (7B for completion, 72B for agents).
  • Multilingual codebases (English + Chinese / Japanese / Korean docs).
  • Existing Alibaba Cloud or Tongyi customers.
  • Best balance of quality, speed, and ecosystem maturity.

DeepSeek V4 Pro wins for…

  • Highest-quality open-weight coding benchmarks.
  • Customers with 4-8× H100 / MI325X clusters available.
  • Long-horizon agentic tasks where deeper reasoning matters.
  • Multi-tenant inference where MoE sparsity reduces per-request cost.
  • Workloads that need 1M-token extended context.

Hardware reality

| Model | Min single-GPU | Comfortable | Best at scale |
| --- | --- | --- | --- |
| Laguna XS.2 (NVFP4) | RTX 4090 (24GB) | RTX 5090 / H100 (80GB) | 1-2× H100 |
| Qwen 3.6 7B | RTX 4070 (12GB) | RTX 4090 (24GB) | Single H100 |
| Qwen 3.6 72B | 2× RTX 4090 (AWQ INT4) | 2× H100 | 4× H100 |
| Qwen 3.6 235B-A22B | Not feasible | 2-4× H100 | 8× H100 / MI325X |
| DeepSeek V4 Pro | Not feasible | 4× H100 (INT4) | 8× H100 / MI325X |

Laguna XS.2’s appeal is real here — it’s the only model in this list that runs comfortably on a single consumer card.
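The table above can be sanity-checked with a simple weight-footprint estimate. A rough sketch that compares 4-bit weight sizes against card VRAM (it deliberately ignores KV cache, activations, and runtime overhead, which add several GB in practice):

```python
# Rough fit check: 4-bit weights (~0.5 bytes/param) vs. available VRAM.
# Ignores KV cache, activations, and serving overhead (add ~2-6 GB in practice).

def int4_weight_gb(total_params_b: float) -> float:
    """Billions of params * 0.5 bytes/param, expressed in GB."""
    return total_params_b * 0.5

MODELS = {
    "Laguna XS.2": 33,
    "Qwen 3.6 7B": 7,
    "Qwen 3.6 72B": 72,
    "DeepSeek V4 Pro": 685,
}
CARDS = {"RTX 4070 (12GB)": 12, "RTX 4090 (24GB)": 24, "H100 (80GB)": 80}

for model, params in MODELS.items():
    need = int4_weight_gb(params)
    fits = [card for card, vram in CARDS.items() if need < vram]
    print(f"{model}: ~{need:.1f} GB weights; fits: {fits or 'multi-GPU only'}")
```

The estimate reproduces the table's shape: Laguna XS.2 at ~16.5 GB clears a 24GB card, Qwen 3.6 72B needs an 80GB card even quantized, and DeepSeek V4 Pro's ~342 GB of weights rules out any single GPU.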

Coding capability ranking (May 2026 open-weight, rough)

Per llm-stats.com, the LMSYS Chatbot Arena, and several independent SWE-bench runs in May 2026:

  1. DeepSeek V4 Pro — SWE-bench Verified ~76% (frontier among open weights).
  2. Qwen 3.6 235B-A22B — ~73%.
  3. Kimi K2.6 — ~72%.
  4. GLM 5.1 — ~70%.
  5. Llama 5 405B — ~68%.
  6. Qwen 3.6 72B — ~66%.
  7. Laguna XS.2 33B-A3B — Poolside reports performance competitive with mid-tier open models; independent SWE-bench numbers are TBD.
  8. DeepSeek V4 Flash, Minimax M2.7 — mid-60s.

Frontier closed models (Claude Opus 4.6/4.7, GPT-5.5) sit at ~80-85%. Laguna XS.2 isn’t trying to beat those; it’s trying to be the best small model you can run locally.

Apache 2.0 vs DeepSeek License — does it matter?

For most developers, no. For enterprise procurement in regulated sectors:

  • Apache 2.0 (Laguna XS.2, Qwen 3.6) — no field-of-use restrictions, full commercial use, modification, redistribution.
  • DeepSeek License — permissive, but explicitly prohibits military, surveillance, and certain other use cases. Chinese-origin can be a separate procurement issue for US government / defense contracts.
  • Llama Community License (Llama 5) — permissive but with restrictions for very large customers (>700M MAU) and acceptable use policy.

For US public-sector or defense workloads, Apache 2.0 + US-vendor (Laguna XS.2) is the cleanest profile. That’s the procurement story Poolside is selling.

Bottom line

In May 2026, Laguna XS.2 is the best new option for local single-GPU agentic coding — it’s the only model in this comparison that runs comfortably on consumer hardware while being Apache 2.0 and backed by a US vendor. Qwen 3.6 is the right default for production self-hosted serving thanks to its size flexibility and ecosystem maturity. DeepSeek V4 Pro remains the frontier for open-weight coding quality when you have the GPU budget. The smart pattern: run Laguna XS.2 on developer laptops for sensitive local work, Qwen 3.6 for shared self-hosted inference, and reserve DeepSeek V4 Pro or frontier closed models for the hardest agentic tasks.

Sources: Poolside official announcement (April 28, 2026), Poolside models page (May 2026), Hugging Face poolside org (May 2026), AI CERTs News coverage (May 4, 2026), Softtechhub coverage (May 5, 2026), AI Chronicle (May 2026), NVIDIA Developer Forums (May 4, 2026), llm-stats.com benchmarks (May 2026).