

What Is NVIDIA RTX Spark? 1-Petaflop Windows AI PC Chip



NVIDIA’s RTX Spark is the first PC superchip purpose-built for on-device AI agents. Announced June 1, 2026, at Computex Taipei, it pairs a 20-core Grace CPU with a Blackwell RTX GPU and delivers up to 1 petaflop of FP4 AI compute in a laptop.

Last verified: June 4, 2026

Quick specs

| Component | Detail |
| --- | --- |
| CPU | 20-core NVIDIA Grace (Arm), built with MediaTek |
| GPU | Blackwell RTX, 6,144 CUDA cores, 5th-gen Tensor Cores, FP4 |
| Interconnect | NVLink-C2C chip-to-chip |
| AI compute | Up to 1 petaflop (FP4) |
| Memory | Up to 128 GB unified LPDDR5X (16/32/64/128 GB SKUs) |
| OS target | Windows on Arm |
| Announced | June 1, 2026 (Computex / GTC Taipei) |
| Ships | Fall 2026 |
| OEMs | Dell, HP, Lenovo, ASUS, MSI, Microsoft Surface (Acer, Gigabyte later) |

Why RTX Spark matters

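Until June 2026, the AI PC story was incremental NPU upgrades: 40, 50, then 80 TOPS. RTX Spark jumps to a nominal 1,000 TOPS (1 petaflop at FP4) by putting a full Blackwell RTX GPU and Grace CPU in a single package. That’s roughly 20x the rated AI compute of a 50-TOPS Copilot+ NPU, though NPU TOPS are usually quoted at INT8 rather than FP4, so the figures are not strictly like-for-like. Combined with up to 128 GB of unified memory, it is enough to run frontier-class open models locally.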

It’s also the first time NVIDIA has shipped a PC CPU and GPU together as a single unified superchip, a break from the company’s traditional discrete-GPU-only PC strategy.

What you can run locally

With the 128 GB unified-memory configuration and 1 petaflop of FP4 compute:

  • Llama 5 70B at FP8 (~70 GB of weights) with room for context; full FP16 weights (~140 GB) would exceed 128 GB
  • MoE models up to ~200B total parameters (~30-50B active) via FP4 quantization (~100 GB of weights)
  • Local image gen: Flux and SDXL at high resolution, in real time
  • Local video gen: short clips from open models
  • Long-context agents: context windows of up to a million tokens held in memory without paging, depending on the model’s KV-cache footprint
  • Real-time voice agents: Whisper + LLM + TTS pipeline end-to-end on-device
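
A rough way to sanity-check what fits in unified memory is to multiply parameter count by bytes per parameter, then add the KV cache for the context you want. The sketch below is back-of-the-envelope arithmetic only, not a measurement: the 8 GB reserve for the OS and runtime is an assumption, and the layer count, KV-head count, and head dimension are illustrative values for a 70B-class model with grouped-query attention, not published figures for any model named here.

```python
# Back-of-the-envelope memory estimate for local LLM inference.
# All model dimensions below are illustrative assumptions.

BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weights_gb(params_billion: float, precision: str) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def kv_cache_gb(tokens: int, layers: int, kv_heads: int, head_dim: int,
                bytes_per_value: int = 2) -> float:
    """KV cache size: one K and one V vector per layer, per KV head, per token."""
    bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return bytes_per_token * tokens / 1e9


def fits(total_gb: float, memory_gb: float = 128.0, reserve_gb: float = 8.0) -> bool:
    """True if the footprint fits after reserving memory for OS and runtime (assumed 8 GB)."""
    return total_gb <= memory_gb - reserve_gb


for name, params, precision in [("70B dense", 70, "fp16"),
                                ("70B dense", 70, "fp8"),
                                ("200B MoE", 200, "fp4")]:
    gb = weights_gb(params, precision)
    print(f"{name} @ {precision}: {gb:.0f} GB weights, fits in 128 GB: {fits(gb)}")

# KV cache for an assumed 80-layer model with 8 KV heads of dimension 128, FP16 cache
for tokens in (128_000, 1_000_000):
    print(f"{tokens:>9,} tokens: {kv_cache_gb(tokens, 80, 8, 128):.1f} GB KV cache")
```

Under these assumed dimensions, a million-token FP16 KV cache alone is several hundred GB, so very long contexts depend on a smaller or quantized cache.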

NVIDIA showed agentic workloads running entirely offline at the Computex keynote — a key proof point for the “personal AI” pitch.

Software stack

Unlike previous Arm Windows efforts, RTX Spark ships with the full CUDA and RTX ecosystem:

  • CUDA 13+ for Arm
  • TensorRT-LLM, vLLM, Ollama
  • NVIDIA NIM microservices
  • Windows AI Foundry integration
  • “Windows-native agents” — Microsoft’s new agent runtime tied to the platform

This is a much bigger software story than Qualcomm’s Snapdragon X Elite launch, which struggled with x86 emulation and ML framework gaps.
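
As an illustration of what local inference on this stack might look like from application code, here is a minimal sketch that talks to an Ollama server over its HTTP API. It assumes Ollama is already installed and listening on its default port (11434), and the model tag in the usage comment is a placeholder for whatever model has been pulled locally. Nothing here is specific to RTX Spark; the same request works against any machine running Ollama.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server is configured differently.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read())
    return body["response"]


# Usage (requires a running Ollama server; the model tag is a placeholder):
# print(generate("llama3.1:8b", "Summarize NVLink-C2C in one sentence."))
```

Because the request never leaves localhost, the prompt and the response stay on the machine.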

RTX Spark vs the competition

| Chip | AI compute | Unified memory | OS |
| --- | --- | --- | --- |
| NVIDIA RTX Spark | ~1 petaflop (FP4) | Up to 128 GB | Windows on Arm |
| Apple M5 Max | ~38 TOPS | Up to 128 GB | macOS |
| AMD Ryzen AI Max+ 395 | ~50 TOPS NPU + ~120 TFLOPS GPU | Up to 128 GB | Windows / Linux |
| Intel Panther Lake | ~180 TOPS NPU | DDR5 (not unified) | Windows |
| Qualcomm Snapdragon X2 Elite | ~80 TOPS NPU | Up to 64 GB | Windows on Arm |

RTX Spark’s AI compute is in a different league. The trade-offs are power draw, x86 compatibility (emulation only), and almost certainly a high price tag.
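
To put the gap in numbers, the sketch below divides RTX Spark’s nominal 1,000 TOPS by the rated figures quoted in the comparison above. Treat the ratios as rough: the vendor figures use different precisions (FP4 versus, typically, INT8) and measure different blocks (NPU versus GPU), and AMD’s separate GPU figure is left out because it is quoted in TFLOPS rather than TOPS.

```python
# Ratio of RTX Spark's nominal AI compute to each competitor's rated figure.
# Figures are the approximate ratings quoted in this article, not measurements.

RTX_SPARK_TOPS = 1000.0  # 1 petaflop (FP4), treated as 1,000 tera-ops

rated_tops = {
    "Apple M5 Max": 38.0,
    "AMD Ryzen AI Max+ 395 (NPU only)": 50.0,
    "Intel Panther Lake (NPU)": 180.0,
    "Qualcomm Snapdragon X2 Elite (NPU)": 80.0,
}


def ratio_vs_spark(tops: float) -> float:
    """How many times larger RTX Spark's nominal figure is than `tops`."""
    return RTX_SPARK_TOPS / tops


ratios = {name: round(ratio_vs_spark(tops), 1) for name, tops in rated_tops.items()}

for name, ratio in ratios.items():
    print(f"{name}: ~{ratio}x")
```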

When can you buy one?

  • Fall 2026 — first OEM laptops and compact desktops
  • No pricing yet — expect $2,500+ for high-memory configurations
  • Microsoft Surface RTX Spark variant is confirmed
  • Dell, HP, Lenovo, ASUS, MSI lead the launch wave

Bottom line

RTX Spark is NVIDIA’s bet that AI agents will run on-device, not in the cloud. By shipping a 1-petaflop CUDA-capable superchip in a PC form factor, NVIDIA is forcing a step change in what counts as an “AI PC.” If pricing is competitive, it could become the obvious platform for running frontier-class open models locally in late 2026.