NVIDIA Vera Rubin POD Platform
NVIDIA introduced the Vera Rubin POD, an AI supercomputer platform purpose-built for agentic AI workloads. At full scale, the system spans 40 racks containing 1.2 quadrillion transistors and nearly 20,000 NVIDIA dies, with 1,152 Rubin GPUs delivering 60 exaflops of compute and 10 PB/s of total scale-up bandwidth.
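The topline figures imply some per-unit numbers worth sanity-checking. The derived values below follow arithmetically from the aggregates above; they are not separately published specifications:

```python
# Back-of-envelope check of the Vera Rubin POD topline figures.
# Per-unit numbers are derived from the aggregates in the text,
# not independently published specs.

GPUS = 1_152            # Rubin GPUs at full 40-rack scale
TOTAL_FLOPS = 60e18     # 60 exaflops aggregate compute
SCALE_UP_BW = 10e15     # 10 PB/s total scale-up bandwidth
TRANSISTORS = 1.2e15    # 1.2 quadrillion transistors
DIES = 20_000           # "nearly 20,000 NVIDIA dies"

flops_per_gpu = TOTAL_FLOPS / GPUS          # ~52 petaflops per GPU
bw_per_gpu = SCALE_UP_BW / GPUS             # ~8.7 TB/s per GPU
transistors_per_die = TRANSISTORS / DIES    # ~60 billion per die, on average

print(f"{flops_per_gpu / 1e15:.1f} PFLOPS per GPU")
print(f"{bw_per_gpu / 1e12:.2f} TB/s scale-up bandwidth per GPU")
print(f"{transistors_per_die / 1e9:.0f}B transistors per die (average)")
```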
Five Specialized Rack-Scale Systems
The Vera Rubin POD integrates five purpose-built rack-scale systems:
- NVL72: Core compute engine with 72 Rubin GPUs and 36 Vera CPUs, optimized for the four AI scaling laws (pretraining, post-training, test-time scaling, agentic scaling). Delivers up to 4x better training performance and 10x better inference performance per watt compared to Blackwell.
- Groq 3 LPX: Enables low-latency inference with 256 LPUs per rack.
- Vera CPU: Provides 256 CPUs per rack for large-scale reinforcement learning and sandboxed environments.
- BlueField-4 STX: Delivers AI-native storage with CMX technology for KV cache management.
- Spectrum-6 SPX: Supplies silicon photonics-based networking for low-latency, resilient connectivity.
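The five rack types above can be captured as a simple inventory model. This is an illustrative sketch only: the names, roles, and per-rack unit counts come from the list above, while the field names, the `units` helper, and the 16-rack NVL72 example (16 × 72 = 1,152 GPUs; the actual 40-rack mix is not specified in the text) are assumptions for the example:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RackSystem:
    """One of the five Vera Rubin POD rack-scale system types."""
    name: str
    role: str
    units_per_rack: Optional[int] = None  # None where the text gives no count
    unit: str = ""

POD_RACK_TYPES = [
    RackSystem("NVL72", "core compute (plus 36 Vera CPUs)", 72, "Rubin GPU"),
    RackSystem("Groq 3 LPX", "low-latency inference", 256, "LPU"),
    RackSystem("Vera CPU", "reinforcement learning / sandboxed environments", 256, "CPU"),
    RackSystem("BlueField-4 STX", "AI-native storage, KV cache management (CMX)"),
    RackSystem("Spectrum-6 SPX", "silicon photonics-based networking"),
]

def units(pod: dict, unit: str) -> int:
    """Total units of a given type for a pod described as {rack name: rack count}."""
    return sum(r.units_per_rack * count
               for name, count in pod.items()
               for r in POD_RACK_TYPES
               if r.name == name and r.unit == unit and r.units_per_rack)

# Example: sixteen NVL72 racks account for the POD's 1,152 Rubin GPUs.
print(units({"NVL72": 16}, "Rubin GPU"))
```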
Co-Design Across Seven Chip Types
The platform represents extreme co-design spanning seven chip types across compute, networking, and storage domains. All systems leverage the third-generation NVIDIA MGX rack architecture with two design variants: the MGX NVL rack connected via NVLink, and the new MGX ETL rack connected via Spectrum-X Ethernet or Groq 3 LPU direct chip-to-chip links.
Infrastructure Features for Agentic AI
Key innovations include a modular cable-free design, dynamic power steering, rack-level energy storage, intelligent power smoothing, and 45°C liquid cooling. The open MGX standard and an ecosystem of more than 80 partners accelerate deployment and anchor a global supply chain for large-scale AI systems.
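Rack-level energy storage and power smoothing address the bursty draw of synchronized GPU training, where whole racks spike and idle in lockstep. The text does not describe NVIDIA's actual control scheme, so the sketch below is only a generic peak-shaving illustration: a battery covers demand above a grid cap and recharges during lulls, with all numbers chosen arbitrarily for the example:

```python
def smooth_power(demand_kw, grid_cap_kw, battery_kwh, step_h=1 / 3600):
    """Generic peak-shaving sketch (NOT NVIDIA's actual algorithm):
    the grid supplies at most grid_cap_kw; a rack-level battery
    discharges to cover spikes and recharges during lulls."""
    soc = battery_kwh  # state of charge, start full
    grid_trace = []
    for d in demand_kw:
        grid = min(d, grid_cap_kw)
        if d > grid_cap_kw:
            # Spike: battery covers the shortfall (clamped at empty).
            soc = max(0.0, soc - (d - grid_cap_kw) * step_h)
        elif d < grid_cap_kw:
            # Lull: recharge from grid headroom, up to battery capacity.
            charge_kw = min(grid_cap_kw - d, (battery_kwh - soc) / step_h)
            soc += charge_kw * step_h
            grid = d + charge_kw
        grid_trace.append(grid)
    return grid_trace

# Bursty training load alternating 800 kW spikes with 200 kW lulls,
# smoothed against a 500 kW grid cap (illustrative numbers only):
demand = [800, 200] * 5
grid = smooth_power(demand, grid_cap_kw=500, battery_kwh=50)
print(max(grid))  # grid draw never exceeds the cap
```

The point of the sketch is the shape of the result: the facility sees a near-constant draw at the cap, while the battery absorbs the GPU-driven swings.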