NVIDIA Vera Rubin POD: Purpose-Built Supercomputer for Agentic AI
NVIDIA introduced the Vera Rubin POD, a comprehensive AI infrastructure platform optimized for the emerging era of agentic AI systems. The POD reflects extreme co-design across seven chip types spanning compute, networking, and storage, and comprises five distinct purpose-built rack-scale systems that function together as one cohesive supercomputer.
Key System Specifications
The Vera Rubin POD operates at massive scale: 40 racks, 1,152 NVIDIA Rubin GPUs, 1.2 quadrillion transistors, and approximately 20,000 NVIDIA dies. Total performance reaches 60 exaflops with 10 PB/s of aggregate bandwidth, sized for the compute-intensive demands of modern AI agents that perform reasoning, planning, tool invocation, and multi-step workflows.
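As a rough sanity check on the headline figures, the per-unit averages can be derived directly from the totals. Note these derived numbers are illustrative only; the source states aggregate figures, not official per-GPU or per-die specifications.

```python
# Back-of-the-envelope check of the published Vera Rubin POD totals.
# Derived per-unit figures are illustrative averages, not official specs.

TOTAL_GPUS = 1_152
TOTAL_EXAFLOPS = 60
TOTAL_TRANSISTORS = 1.2e15  # 1.2 quadrillion
TOTAL_DIES = 20_000         # "approximately 20,000 NVIDIA dies"

# 60 EF spread over 1,152 GPUs -> average PFLOPS per GPU
petaflops_per_gpu = TOTAL_EXAFLOPS * 1_000 / TOTAL_GPUS

# 1.2 quadrillion transistors over ~20,000 dies -> average per die
transistors_per_die = TOTAL_TRANSISTORS / TOTAL_DIES

print(f"~{petaflops_per_gpu:.1f} PFLOPS per GPU on average")
print(f"~{transistors_per_die / 1e9:.0f}B transistors per die on average")
```

These averages (roughly 52 PFLOPS per GPU and 60 billion transistors per die) are simply the quotients of the published totals and do not distinguish compute dies from networking or storage silicon.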
Five Specialized Rack-Scale Systems
- NVL72: The core compute engine with 72 Rubin GPUs and 36 Vera CPUs per rack, optimized for the four AI scaling laws (pretraining, post-training, test-time scaling, and agentic scaling). Delivers 4x better training performance and 10x better inference performance per watt compared to Blackwell.
- Groq 3 LPX: Provides 256 LPUs per rack for extreme low-latency inference.
- Vera CPU: Delivers 256 CPUs per rack for large-scale reinforcement learning and sandboxed environments.
- BlueField-4 STX: AI-native storage with CMX for KV cache management.
- Spectrum-6 SPX: Silicon photonics-based networking for low-latency, resilient connectivity.
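The per-rack unit counts in the list above can be checked against the POD totals. Assuming the 1,152-GPU total comes entirely from NVL72 racks (the source does not state the rack mix explicitly), the GPU count implies 16 NVL72 compute racks, leaving 24 racks for the other four system types in an unspecified split:

```python
# Consistency check of the rack-level unit counts listed above.
# The 16/24 split is inferred from the GPU total, not stated in the source.

GPUS_PER_NVL72 = 72   # Rubin GPUs per NVL72 rack
TOTAL_RACKS = 40
TOTAL_GPUS = 1_152

# GPU total divides evenly into NVL72 racks
nvl72_racks = TOTAL_GPUS // GPUS_PER_NVL72
assert TOTAL_GPUS % GPUS_PER_NVL72 == 0

# Remaining racks host the LPX, CPU, STX, and SPX systems
other_racks = TOTAL_RACKS - nvl72_racks

print(f"{nvl72_racks} NVL72 compute racks, {other_racks} other racks")
```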
Architecture and Deployment
All racks share the third-generation NVIDIA MGX architecture with unified power, cooling, and mechanical specifications, enabling rapid deployment and seamless integration. The open MGX standard supports an ecosystem of over 80 partners with established global supply chains for large-scale AI systems.
Innovative MGX rack features include a modular cable-free design, dynamic power steering, rack-level energy storage, intelligent power smoothing, and 45°C liquid cooling—all designed to maximize reliability, serviceability, and energy efficiency for next-generation AI data centers.