What Is Nvidia Vera Rubin? The Platform Powering Agentic AI
Nvidia Vera Rubin is a next-generation rack-scale AI computing platform unveiled at GTC 2026. Named after the astronomer whose galaxy-rotation measurements provided key evidence for dark matter, it combines new Vera CPUs, Rubin GPUs, and Groq 3 LPUs into a unified system designed for agentic AI workloads.
Last verified: March 2026
Quick Facts
| Detail | Info |
|---|---|
| Announced | GTC 2026 (March 16–19, San Jose) |
| Architecture | Vera CPU + Rubin GPU + Groq 3 LPU |
| Key config | NVL72: 72 Rubin GPUs + 36 Vera CPUs |
| Target workload | Agentic AI, reinforcement learning, inference |
| Status | Full production (March 2026) |
| Revenue target | Part of Nvidia’s $1T order pipeline through 2027 |
Architecture Overview
The Vera Rubin platform combines several chip types working together at rack scale. The three headline components:
Vera CPU
A custom Nvidia CPU designed specifically for AI workloads. Unlike general-purpose server CPUs, Vera is optimized for:
- Agent orchestration and scheduling
- CPU-native AI workloads
- Rack-scale confidential computing
- Zero-downtime maintenance
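Nvidia has not published Vera's scheduling interface, so as an illustration only, here is a minimal sketch of what CPU-side agent orchestration means in practice: a CPU-resident scheduler queues agent tasks and dispatches them to accelerator workers. Every name in this sketch is hypothetical; plain threads stand in for GPU/LPU inference engines.

```python
import queue
import threading

# Hypothetical sketch of CPU-side agent orchestration.
# Worker threads stand in for GPU/LPU inference engines;
# this is NOT an Nvidia API.

def run_agents(tasks, num_workers=4):
    """Dispatch (agent_id, prompt) tasks to workers, collect replies."""
    task_q = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = task_q.get()
            if item is None:              # poison pill: shut down worker
                task_q.task_done()
                return
            agent_id, prompt = item
            reply = f"response to {prompt}"   # stand-in for inference
            with lock:
                results[agent_id] = reply
            task_q.task_done()

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for t in tasks:
        task_q.put(t)
    for _ in workers:                     # one pill per worker
        task_q.put(None)
    for w in workers:
        w.join()
    return results

out = run_agents([(i, f"task-{i}") for i in range(8)])
```

The CPU's job in this pattern is pure coordination: queueing, dispatch, and result collection, which is the kind of work the bullets above describe.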
Rubin GPU
The successor to Blackwell, designed for next-generation AI compute:
- Higher throughput for inference workloads
- Optimized for continuous reasoning tasks
- Improved memory bandwidth for large context windows
Groq 3 LPU (Language Processing Unit)
A new chip category targeting ultra-fast inference:
- 1,500 tokens/second target throughput
- Purpose-built for agentic AI inference
- Designed to handle the rapid back-and-forth of AI agents taking actions
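Taking the stated 1,500 tokens/second target at face value, a back-of-the-envelope calculation shows what that throughput means per token and per response (illustrative arithmetic only, not a benchmark):

```python
# Back-of-the-envelope latency from the stated 1,500 tok/s target.
TARGET_TOKENS_PER_SEC = 1_500

def per_token_latency_ms(tokens_per_sec):
    """Average time to emit one token, in milliseconds."""
    return 1_000 / tokens_per_sec

def response_time_s(num_tokens, tokens_per_sec=TARGET_TOKENS_PER_SEC):
    """Time to stream a full response of num_tokens tokens."""
    return num_tokens / tokens_per_sec

# At 1,500 tok/s each token takes roughly 0.67 ms,
# so a 300-token agent step streams out in about 0.2 s.
token_ms = per_token_latency_ms(TARGET_TOKENS_PER_SEC)
step_s = response_time_s(300)
```

Sub-second agent steps are what make the "rapid back-and-forth" of tool-using agents feel interactive.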
Why Vera Rubin Matters
The shift from Blackwell to Vera Rubin reflects a fundamental change in AI computing needs:
| Era | Focus | Platform |
|---|---|---|
| Training era (2023–2025) | Building large models | Blackwell |
| Agentic era (2026+) | Running AI agents at scale | Vera Rubin |
Agentic AI workloads are fundamentally different from training:
- Continuous inference — Agents reason constantly, not in batches
- Low latency — Agents need fast responses to take real-time actions
- CPU + GPU — Agent orchestration needs CPUs alongside GPUs
- Scale — Millions of concurrent AI agents need massive infrastructure
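To see why low latency matters more for agents than for batch training, consider an agent turn that chains several reason-act steps: per-step latency compounds across the chain. The numbers below are illustrative assumptions, not Nvidia figures.

```python
# Illustrative: why per-step latency compounds for agents.

def agent_turn_latency_s(steps, tokens_per_step, tokens_per_sec,
                         tool_latency_s=0.05):
    """Total wall-clock time for one agent turn:
    inference time per step plus a fixed tool-call overhead."""
    inference = steps * tokens_per_step / tokens_per_sec
    tools = steps * tool_latency_s
    return inference + tools

# A 5-step turn at 150 tokens per step:
# at 100 tok/s (batch-oriented serving) the turn takes 7.75 s,
# at 1,500 tok/s it takes 0.75 s.
slow = agent_turn_latency_s(5, 150, 100)
fast = agent_turn_latency_s(5, 150, 1_500)
```

A 10x throughput difference per step becomes the gap between an agent that responds in under a second and one that keeps the user waiting.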
NVL72 Configuration
The flagship Vera Rubin NVL72 system combines:
- 72 Rubin GPUs — Parallel AI processing
- 36 Vera CPUs — Agent orchestration and CPU workloads
- Rack-scale design — Five rack configurations available
- Context memory storage — Built-in storage for long agent contexts
- Confidential computing — Hardware-level security for enterprise
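The stated NVL72 counts can be tallied directly: 72 GPUs and 36 CPUs give a 2:1 GPU-to-CPU ratio and 108 compute sockets per rack. The deployment-sizing helper below is an assumption for illustration, not an Nvidia sizing tool.

```python
import math

# Tally the NVL72 configuration from the stated counts.
RUBIN_GPUS = 72
VERA_CPUS = 36

gpu_per_cpu = RUBIN_GPUS // VERA_CPUS    # 2 GPUs per Vera CPU
total_sockets = RUBIN_GPUS + VERA_CPUS   # 108 compute sockets per rack

def racks_needed(total_gpus, gpus_per_rack=RUBIN_GPUS):
    """NVL72 racks needed to field a given Rubin GPU count
    (hypothetical sizing helper)."""
    return math.ceil(total_gpus / gpus_per_rack)

# Fielding 10,000 Rubin GPUs would take 139 NVL72 racks.
racks = racks_needed(10_000)
```

The fixed 2:1 ratio bakes the platform's bet into the hardware: every pair of inference GPUs ships with a CPU dedicated to orchestration.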
Space-1: Vera Rubin in Orbit
Nvidia also revealed Space-1 Vera Rubin, a system designed to bring AI data centers into orbit — extending accelerated computing from Earth to space for satellite and autonomous systems.
Who’s Using It
Major cloud providers and AI companies are expected to deploy Vera Rubin through 2026–2027, as agentic AI workloads drive demand for specialized inference infrastructure.
Jensen Huang stated Nvidia sees $1 trillion in orders for Blackwell and Vera Rubin through 2027, underscoring the scale of the agentic AI buildout.