NVIDIA Vera Rubin POD: A New Generation of AI Supercomputing
NVIDIA has announced the Vera Rubin POD, a comprehensive platform for agentic AI workloads that marks a significant evolution in data center-scale AI infrastructure. The platform integrates five specialized rack-scale systems that operate together as a single, cohesive AI supercomputer, optimized for the emerging paradigm of AI agents interacting with other AI agents and systems.
Five Specialized Rack-Scale Systems
The Vera Rubin POD consists of five purpose-built systems, each addressing distinct workload requirements:
- NVIDIA Vera Rubin NVL72: The core compute engine with 72 Rubin GPUs and 36 Vera CPUs, optimized for the four scaling laws of AI (pretraining, post-training, test-time scaling, and agentic scaling). It delivers up to 10x better inference performance per watt compared to Blackwell.
- Groq 3 LPX: Delivers 256 LPUs per rack for extreme low-latency inference workloads.
- Vera CPU: Provides 256 CPUs per rack for large-scale reinforcement learning and CPU-based sandboxing environments.
- BlueField-4 STX: AI-native storage system with support for KV cache operations.
- Spectrum-6 SPX: Silicon photonics-based networking for low-latency, resilient connectivity across the POD.
Massive Scale and Co-Design
The complete Vera Rubin POD spans 40 racks and integrates 1.2 quadrillion transistors across nearly 20,000 NVIDIA dies, 1,152 Rubin GPUs, 60 exaflops of compute, and 10 PB/s of total scale-up bandwidth. This extreme co-design across seven chip types, spanning compute, networking, and storage, enables support for modern agentic AI paradigms including mixture-of-experts routing, reinforcement learning, and large-context memory operations.
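The pod-level figures above imply some per-unit numbers worth making explicit. Taking the quoted totals at face value (and noting that the announcement does not state which precision the 60 exaflops refers to), a back-of-envelope breakdown:

```python
# Back-of-envelope arithmetic on the pod-level figures quoted above.
POD_RACKS = 40
POD_GPUS = 1152              # Rubin GPUs per pod
GPUS_PER_NVL72_RACK = 72     # per the NVL72 description
POD_COMPUTE_EF = 60          # exaflops (precision/format unspecified)

# 1,152 GPUs at 72 per NVL72 rack implies 16 compute racks...
compute_racks = POD_GPUS // GPUS_PER_NVL72_RACK      # 1152 / 72 = 16
# ...leaving 24 racks for the other four system types.
other_racks = POD_RACKS - compute_racks              # 40 - 16 = 24
# 60 EF over 1,152 GPUs is roughly 52 petaflops per GPU.
pf_per_gpu = POD_COMPUTE_EF * 1000 / POD_GPUS        # ≈ 52.1 PF
```

The rack split is an inference from the quoted totals, not a stated configuration; the actual mix of LPX, Vera CPU, STX, and SPX racks is not broken down in the announcement.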
Architecture and Deployment
Built on the third-generation NVIDIA MGX rack architecture, the platform features innovative mechanical and thermal design elements including a modular cable-free design, dynamic power steering, rack-level energy storage, intelligent power smoothing, and 45°C liquid cooling. The platform supports both NVLink-connected (NVL) and Ethernet/Groq LPU-connected (ETL) configurations, with two types of copper spines designed for performance, resiliency, and energy efficiency.
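To illustrate the power-management idea named above: "power smoothing" with rack-level energy storage means the battery covers demand spikes above a grid cap and recharges when demand drops, so the facility sees a flatter draw. The following is a conceptual simulation only, with assumed parameters, not a description of NVIDIA's implementation:

```python
def smooth_power(draw_w, grid_cap_w, battery_wh, step_s=1.0):
    """Conceptual rack power smoothing: the battery discharges to cover
    demand above the grid cap and recharges (up to the cap) when demand
    falls below it. Returns the per-step power seen by the grid."""
    grid = []
    charge = battery_wh * 3600.0          # stored energy in joules
    max_charge = charge
    for p in draw_w:
        if p > grid_cap_w:
            # Spike: battery supplies the deficit, limited by its charge.
            deficit = min(p - grid_cap_w, charge / step_s)
            charge -= deficit * step_s
            grid.append(p - deficit)
        else:
            # Slack: recharge from the grid without exceeding the cap.
            headroom = min(grid_cap_w - p, (max_charge - charge) / step_s)
            charge += headroom * step_s
            grid.append(p + headroom)
    return grid
```

With a sufficiently sized battery, bursty training or inference load (e.g. synchronized GPU power swings) is presented to the grid as a near-constant draw at the cap.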
Ecosystem and Availability
The platform is backed by an ecosystem of more than 80 partners with proven experience in bringing large-scale AI systems to market. The open MGX standard and global partner ecosystem aim to accelerate adoption and deployment of rack-scale and POD-scale AI infrastructure.