NVIDIA unveils Vera Rubin POD; integrates five rack systems with 60 exaflops for agentic AI
· release · platform · performance · developer.nvidia.com

Overview

NVIDIA Vera Rubin POD represents a comprehensive rethinking of data center infrastructure for the emerging era of agentic AI, in which AI agents interact with other AI agents, generating massive volumes of reasoning tokens and KV cache. Built on the third-generation NVIDIA MGX rack architecture, this POD-scale platform integrates five purpose-built rack-scale systems designed to work cohesively as a single AI supercomputer.
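The KV-cache pressure described above can be made concrete with a back-of-the-envelope estimate. The sketch below uses the standard KV-cache sizing formula for a transformer decoder; the model dimensions (layers, heads, head size) are illustrative assumptions, not Rubin or any specific model's specifications.

```python
# Rough KV-cache size estimate for a transformer decoder.
# The model dimensions used in the example are illustrative assumptions.

def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, dtype_bytes: int = 2) -> int:
    """Bytes of KV cache for one sequence: a K and a V tensor per layer."""
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# Hypothetical 70B-class model with grouped-query attention, FP16 cache
per_token = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, seq_len=1)
print(f"KV cache per token: {per_token / 1024:.0f} KiB")  # 320 KiB

long_ctx = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                          seq_len=1_000_000)
print(f"1M-token context: {long_ctx / 2**30:.0f} GiB")    # ~305 GiB
```

Even under these modest assumptions, a single million-token agentic session consumes hundreds of gigabytes of cache, which is the workload the dedicated storage and networking racks below are aimed at.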

Core Specifications

The Vera Rubin POD delivers impressive scale and performance metrics:

  • Compute: 1,152 NVIDIA Rubin GPUs across 40 racks
  • Performance: 60 exaflops total, with 10x better inference performance per watt vs. Blackwell
  • Bandwidth: 10 PB/s scale-up bandwidth
  • Scale: 1.2 quadrillion transistors and ~20,000 NVIDIA dies
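The headline figures above imply some useful per-unit numbers, obtained here by simple division; note that the 40 racks include non-GPU racks, so GPUs per rack is only an average.

```python
# Derived per-unit figures from the published POD-level specs.
# Simple division over headline numbers; averages only.

GPUS = 1_152            # Rubin GPUs in the POD
RACKS = 40              # total racks (including non-GPU racks)
TOTAL_EXAFLOPS = 60     # POD-level performance
SCALE_UP_PB_S = 10      # aggregate scale-up bandwidth

print(f"{TOTAL_EXAFLOPS * 1000 / GPUS:.1f} PF per GPU")            # 52.1
print(f"{GPUS / RACKS:.1f} GPUs per rack (average)")               # 28.8
print(f"{SCALE_UP_PB_S * 1000 / GPUS:.2f} TB/s scale-up per GPU")  # 8.68
```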

Five Specialized Rack Systems

The platform comprises five distinct rack-scale systems, each optimized for specific workload demands:

  1. NVL72: Core compute engine with 72 Rubin GPUs and 36 Vera CPUs, supporting pretraining, post-training, test-time scaling, and mixture-of-experts (MoE) routing
  2. Groq 3 LPX: Specialized inference rack with 256 LPUs per rack for low-latency processing
  3. Vera CPU: Dense CPU rack with 256 CPUs per rack for reinforcement learning and sandboxed environments
  4. BlueField-4 STX: AI-native storage system with CMX cache for handling large KV cache requirements
  5. Spectrum-6 SPX: Silicon photonics-based networking for low-latency, resilient connectivity

Key Features and Infrastructure

The MGX architecture introduces several innovations for improved reliability and efficiency:

  • Modular cable-free design for simplified deployment and maintenance
  • Dynamic power steering and intelligent power smoothing
  • Rack-level energy storage for improved resilience
  • 45°C liquid cooling to maximize energy efficiency

The platform is built on an open MGX standard with an ecosystem of 80+ global partners, enabling faster deployments and seamless integration across rack types.

Developer Impact

Organizations deploying the Vera Rubin POD can expect significant improvements in:

  • Token efficiency: Up to 1/10th the token cost for inference compared to Blackwell
  • Scalability: Support for modern agentic AI paradigms including MoE, reinforcement learning, and large context windows
  • Energy efficiency: Purpose-built systems eliminate wasted compute cycles across heterogeneous workloads
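The token-efficiency claim can be translated into a rough cost model. The sketch below applies the stated 1/10th-cost ratio; the baseline price and workload volume are made-up assumptions for illustration only.

```python
# Illustrative token-economics comparison based on the "up to 1/10th the
# token cost vs. Blackwell" claim. Baseline price and monthly volume are
# hypothetical assumptions, not published figures.

baseline_cost_per_m_tokens = 10.00                      # USD, assumed
rubin_cost_per_m_tokens = baseline_cost_per_m_tokens / 10

monthly_tokens_m = 50_000  # 50B tokens/month, assumed agentic workload
savings = (baseline_cost_per_m_tokens
           - rubin_cost_per_m_tokens) * monthly_tokens_m
print(f"Monthly savings: ${savings:,.0f}")  # $450,000
```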