What Is Nvidia Vera Rubin? The Platform Powering Agentic AI
Nvidia Vera Rubin is a next-generation rack-scale AI computing platform unveiled at GTC 2026. Named after the astronomer whose galaxy-rotation measurements provided key evidence for dark matter, it combines new Vera CPUs, Rubin GPUs, and Groq 3 LPUs into a unified system designed for agentic AI workloads.
Last verified: March 2026
Quick Facts
| Detail | Info |
|---|---|
| Announced | GTC 2026 (March 16–19, San Jose) |
| Architecture | Vera CPU + Rubin GPU + Groq 3 LPU |
| Key config | NVL72: 72 Rubin GPUs + 36 Vera CPUs |
| Target workload | Agentic AI, reinforcement learning, inference |
| Status | Full production (March 2026) |
| Revenue target | Part of Nvidia’s $1T order pipeline through 2027 |
Architecture Overview
The Vera Rubin platform combines several chip types working together at rack scale. The three headline components:
Vera CPU
A custom Nvidia CPU designed specifically for AI workloads. Unlike general-purpose server CPUs, Vera is optimized for:
- Agent orchestration and scheduling
- CPU-native AI workloads
- Rack-scale confidential computing
- Zero-downtime maintenance
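Nvidia has not published Vera's scheduling interface, so as an illustration only, here is a minimal sketch of what CPU-side agent orchestration means in practice: a CPU-resident scheduler queues agent tasks and dispatches them to accelerator workers. Every name in this sketch is hypothetical; plain threads stand in for GPU/LPU inference engines.

```python
import queue
import threading

# Hypothetical sketch of CPU-side agent orchestration.
# Worker threads stand in for GPU/LPU inference engines;
# this is NOT an Nvidia API.

def run_agents(tasks, num_workers=4):
    """Dispatch (agent_id, prompt) tasks to workers, collect replies."""
    task_q = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = task_q.get()
            if item is None:              # poison pill: shut down worker
                task_q.task_done()
                return
            agent_id, prompt = item
            reply = f"response to {prompt}"   # stand-in for inference
            with lock:
                results[agent_id] = reply
            task_q.task_done()

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for t in tasks:
        task_q.put(t)
    for _ in workers:                     # one pill per worker
        task_q.put(None)
    for w in workers:
        w.join()
    return results

out = run_agents([(i, f"task-{i}") for i in range(8)])
```

The CPU's job in this pattern is pure coordination: queueing, dispatch, and result collection, which is the kind of work the bullets above describe.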
Rubin GPU
The successor to Blackwell, designed for next-generation AI compute:
- Higher throughput for inference workloads
- Optimized for continuous reasoning tasks
- Improved memory bandwidth for large context windows
Groq 3 LPU (Language Processing Unit)
A new chip category targeting ultra-fast inference:
- 1,500 tokens/second target throughput
- Purpose-built for agentic AI inference
- Designed to handle the rapid back-and-forth of AI agents taking actions
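Taking the stated 1,500 tokens/second target at face value, a back-of-the-envelope calculation shows what that throughput means per token and per response (illustrative arithmetic only, not a benchmark):

```python
# Back-of-the-envelope latency from the stated 1,500 tok/s target.
TARGET_TOKENS_PER_SEC = 1_500

def per_token_latency_ms(tokens_per_sec):
    """Average time to emit one token, in milliseconds."""
    return 1_000 / tokens_per_sec

def response_time_s(num_tokens, tokens_per_sec=TARGET_TOKENS_PER_SEC):
    """Time to stream a full response of num_tokens tokens."""
    return num_tokens / tokens_per_sec

# At 1,500 tok/s each token takes roughly 0.67 ms,
# so a 300-token agent step streams out in about 0.2 s.
token_ms = per_token_latency_ms(TARGET_TOKENS_PER_SEC)
step_s = response_time_s(300)
```

Sub-second agent steps are what make the "rapid back-and-forth" of tool-using agents feel interactive.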
Why Vera Rubin Matters
The shift from Blackwell to Vera Rubin reflects a fundamental change in AI computing needs:
| Era | Focus | Platform |
|---|---|---|
| Training era (2023–2025) | Building large models | Blackwell |
| Agentic era (2026+) | Running AI agents at scale | Vera Rubin |
Agentic AI workloads are fundamentally different from training:
- Continuous inference — Agents reason constantly, not in batches
- Low latency — Agents need fast responses to take real-time actions
- CPU + GPU — Agent orchestration needs CPUs alongside GPUs
- Scale — Millions of concurrent AI agents need massive infrastructure
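To see why low latency matters more for agents than for batch training, consider an agent turn that chains several reason-act steps: per-step latency compounds across the chain. The numbers below are illustrative assumptions, not Nvidia figures.

```python
# Illustrative: why per-step latency compounds for agents.

def agent_turn_latency_s(steps, tokens_per_step, tokens_per_sec,
                         tool_latency_s=0.05):
    """Total wall-clock time for one agent turn:
    inference time per step plus a fixed tool-call overhead."""
    inference = steps * tokens_per_step / tokens_per_sec
    tools = steps * tool_latency_s
    return inference + tools

# A 5-step turn at 150 tokens per step:
# at 100 tok/s (batch-oriented serving) the turn takes 7.75 s,
# at 1,500 tok/s it takes 0.75 s.
slow = agent_turn_latency_s(5, 150, 100)
fast = agent_turn_latency_s(5, 150, 1_500)
```

A 10x throughput difference per step becomes the gap between an agent that responds in under a second and one that keeps the user waiting.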
NVL72 Configuration
The flagship Vera Rubin NVL72 system combines:
- 72 Rubin GPUs — Parallel AI processing
- 36 Vera CPUs — Agent orchestration and CPU workloads
- Rack-scale design — Five rack configurations available
- Context memory storage — Built-in storage for long agent contexts
- Confidential computing — Hardware-level security for enterprise
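The stated NVL72 counts can be tallied directly: 72 GPUs and 36 CPUs give a 2:1 GPU-to-CPU ratio and 108 compute sockets per rack. The deployment-sizing helper below is an assumption for illustration, not an Nvidia sizing tool.

```python
import math

# Tally the NVL72 configuration from the stated counts.
RUBIN_GPUS = 72
VERA_CPUS = 36

gpu_per_cpu = RUBIN_GPUS // VERA_CPUS    # 2 GPUs per Vera CPU
total_sockets = RUBIN_GPUS + VERA_CPUS   # 108 compute sockets per rack

def racks_needed(total_gpus, gpus_per_rack=RUBIN_GPUS):
    """NVL72 racks needed to field a given Rubin GPU count
    (hypothetical sizing helper)."""
    return math.ceil(total_gpus / gpus_per_rack)

# Fielding 10,000 Rubin GPUs would take 139 NVL72 racks.
racks = racks_needed(10_000)
```

The fixed 2:1 ratio bakes the platform's bet into the hardware: every pair of inference GPUs ships with a CPU dedicated to orchestration.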
Space-1: Vera Rubin in Orbit
Nvidia also revealed Space-1 Vera Rubin, a system designed to bring AI data centers into orbit — extending accelerated computing from Earth to space for satellite and autonomous systems.
Who’s Using It
Major cloud providers and AI companies are expected to deploy Vera Rubin through 2026–2027, as agentic AI workloads drive demand for specialized inference infrastructure.
Jensen Huang stated Nvidia sees $1 trillion in orders for Blackwell and Vera Rubin through 2027, underscoring the scale of the agentic AI buildout.