What Is NVIDIA RTX Spark? 1-Petaflop Windows AI PC Chip
NVIDIA’s RTX Spark is the first PC superchip purpose-built for on-device AI agents. Announced June 1, 2026 at Computex Taipei, it pairs a 20-core Grace CPU with a Blackwell RTX GPU and delivers 1 petaflop of AI compute in a laptop.
Last verified: June 4, 2026
Quick specs
| Component | Detail |
|---|---|
| CPU | 20-core NVIDIA Grace (Arm), built with MediaTek |
| GPU | Blackwell RTX, 6,144 CUDA cores, 5th-gen Tensor Cores, FP4 |
| Interconnect | NVLink-C2C chip-to-chip |
| AI compute | Up to 1 petaflop (FP4) |
| Memory | Up to 128 GB unified LPDDR5X (16/32/64/128 GB SKUs) |
| OS target | Windows on Arm |
| Announced | June 1, 2026 (Computex / GTC Taipei) |
| Ships | Fall 2026 |
| OEMs | Dell, HP, Lenovo, ASUS, MSI, Microsoft Surface (Acer, Gigabyte later) |
Why RTX Spark matters
Until June 2026, the AI PC story was one of incremental NPU upgrades: 40, 50, then 80 TOPS. RTX Spark jumps to about 1,000 TOPS (1 petaflop at FP4) by putting a full Blackwell RTX GPU and a Grace CPU in a single package. That’s roughly 20x the AI compute of a 50-TOPS Copilot+ NPU, or about 12x an 80-TOPS part, although the two figures are quoted at different precisions. It is enough to run frontier-class models locally.
It’s also the first time NVIDIA has shipped a PC CPU and GPU together as a unified superchip, breaking with the company’s traditional discrete-GPU-only PC strategy.
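The size of the jump depends on which NPU generation is taken as the baseline. A quick sketch of the arithmetic, assuming (as the text does) that 1 petaflop can be read as 1,000 TOPS; the name `speedup_over` is illustrative only, and because NPU ratings are usually quoted at INT8 while the RTX Spark figure is FP4, the ratios are indicative rather than like-for-like:

```python
# Ratio of RTX Spark's quoted AI compute to the NPU ratings cited above.
# Assumption: 1 petaflop (FP4) is treated as 1,000 TOPS, as in the text.
# NPU TOPS are typically INT8 figures, so this is not a like-for-like comparison.
SPARK_TOPS = 1_000


def speedup_over(npu_tops: float) -> float:
    """Return how many times larger SPARK_TOPS is than a given NPU rating."""
    return SPARK_TOPS / npu_tops


for npu_tops in (40, 50, 80):
    print(f"{npu_tops} TOPS NPU -> {speedup_over(npu_tops):.1f}x")
```

This works out to 25x, 20x, and 12.5x for the 40, 50, and 80 TOPS generations respectively.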
What you can run locally
With 128 GB unified memory and 1 petaflop FP4:
- Llama 5 70B at FP8 (~70 GB of weights) or FP4 (~35 GB) with room for context; full FP16 weights need ~140 GB, more than the 128 GB maximum
- MoE models up to ~200B total parameters (active ~30-50B) via FP4 quantization
- Local image gen — Flux, SDXL at high resolution, in real time
- Local video gen — short clips from open models
- Long-context agents — million-token windows in memory without paging
- Real-time voice agents — Whisper + LLM + TTS pipeline end-to-end on-device
NVIDIA showed agentic workloads running entirely offline at the Computex keynote — a key proof point for the “personal AI” pitch.
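Whether a given model fits in unified memory comes down to parameter count times bytes per parameter. A minimal sketch of that arithmetic, counting weights only; the helper names are hypothetical, and the KV cache, activations, and runtime overhead (all ignored here) add to the real total:

```python
# Rough weight-only memory estimate: parameters x bits per parameter.
# Ignores KV cache, activations, and runtime overhead.
BITS_PER_PARAM = {"fp16": 16, "fp8": 8, "fp4": 4}
UNIFIED_MEMORY_GB = 128  # top SKU quoted in the spec table


def weight_gb(params_billions: float, precision: str) -> float:
    """Weight size in GB (1 GB = 1e9 bytes) at the given precision."""
    return params_billions * BITS_PER_PARAM[precision] / 8


def fits(params_billions: float, precision: str) -> bool:
    """True if the weights alone are smaller than the unified memory pool."""
    return weight_gb(params_billions, precision) < UNIFIED_MEMORY_GB


for label, params, precision in [
    ("70B dense", 70, "fp16"),
    ("70B dense", 70, "fp8"),
    ("200B MoE (total params)", 200, "fp4"),
]:
    size = weight_gb(params, precision)
    verdict = "fits" if fits(params, precision) else "does not fit"
    print(f"{label} @ {precision}: {size:.0f} GB -> {verdict}")
```

By this estimate a 70B dense model needs about 140 GB at FP16 and 70 GB at FP8, while a 200B-parameter MoE needs about 100 GB at FP4. For MoE models the total parameter count is what matters for memory, since every expert has to be resident even though only some are active per token.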
Software stack
Unlike previous Arm Windows efforts, RTX Spark ships with the full CUDA and RTX ecosystem:
- CUDA 13+ for Arm
- TensorRT-LLM, vLLM, Ollama
- NVIDIA NIM microservices
- Windows AI Foundry integration
- “Windows-native agents” — Microsoft’s new agent runtime tied to the platform
This is a much bigger software story than Qualcomm’s Snapdragon X Elite launch, which struggled with x86 emulation and ML framework gaps.
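As an illustration of what that stack looks like from application code, here is a minimal sketch of calling a locally running Ollama server over its REST API. It assumes the server is listening on Ollama's default port and that a model has already been pulled; the model tag in the comment is a placeholder, the helper names are hypothetical, and nothing here is specific to RTX Spark.

```python
import json
import urllib.request

# Assumption: an Ollama server is running locally on its default port (11434).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call (requires a running server and a pulled model; tag is a placeholder):
# print(generate("llama3.1:8b", "Summarize NVLink-C2C in one sentence."))
```

Because the request never leaves `localhost`, the same code runs with the network disconnected, which is the on-device scenario the platform is pitched at.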
RTX Spark vs the competition
| Chip | AI compute | Unified memory | OS |
|---|---|---|---|
| NVIDIA RTX Spark | ~1 petaflop (FP4) | Up to 128 GB | Windows-on-Arm |
| Apple M5 Max | ~38 TOPS | Up to 128 GB | macOS |
| AMD Ryzen AI Max+ 395 | ~50 TOPS NPU + ~120 TFLOPS GPU | Up to 128 GB | Windows / Linux |
| Intel Panther Lake | ~180 platform TOPS (NPU ~50 TOPS) | DDR5 (not unified) | Windows |
| Qualcomm Snapdragon X2 Elite | ~80 TOPS NPU | Up to 64 GB | Windows-on-Arm |
RTX Spark’s AI compute is in a different league, though the table mixes vendor-quoted figures at different precisions. The trade-offs are power draw, x86 compatibility (emulation only), and almost certainly a high price tag.
When can you buy one?
- Fall 2026 — first OEM laptops and compact desktops
- No pricing yet — expect $2,500+ for high-memory configurations
- Microsoft Surface RTX Spark variant is confirmed
- Dell, HP, Lenovo, ASUS, MSI lead the launch wave
Bottom line
RTX Spark is NVIDIA’s bet that AI agents will run on-device, not in the cloud. By shipping a 1-petaflop CUDA-capable superchip in a PC form factor, NVIDIA is forcing a step-change in what counts as an “AI PC.” If pricing lands reasonably, this becomes the obvious platform for local frontier models in late 2026.