NVIDIA Vera Rubin POD Overview
NVIDIA introduced the Vera Rubin POD, an AI supercomputer platform engineered for the agentic AI era. Built through extreme co-design of seven distinct chip types spanning compute, networking, and storage, the platform is designed to deliver generational gains in scale and per-watt efficiency for modern AI workloads.
Platform Architecture & Specifications
The Vera Rubin POD comprises five specialized rack-scale systems:
- NVL72: Core compute engine with 72 Rubin GPUs and 36 Vera CPUs, connected via NVLink. Optimized for pretraining, post-training, test-time scaling, and agentic scaling, with up to 4x better training performance and 10x better inference efficiency per watt compared to Blackwell.
- Groq 3 LPX: Delivers 256 LPUs per rack for extreme low-latency inference, ideal for agent interaction loops.
- Vera CPU: Provides 256 CPUs per rack for large-scale reinforcement learning and for sandboxed environments that validate AI-generated outputs.
- BlueField-4 STX: AI-native storage system with CMX technology for KV cache management and massive context memory.
- Spectrum-6 SPX: Silicon photonics-based networking for low-latency, resilient connectivity across the supercomputer.
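The overview does not detail how CMX manages KV cache, so the sketch below only illustrates the general pattern such a storage tier enables: keeping hot attention key/value blocks in fast (GPU-resident) memory and spilling cold ones to a larger storage tier. The class, method names, and sizes are hypothetical, not NVIDIA APIs.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier KV cache: a small 'fast' tier backed by a 'storage' tier.

    Recently used KV blocks stay in the fast tier; least-recently-used
    blocks spill to storage and are promoted back on access. All names
    and capacities here are illustrative only.
    """

    def __init__(self, fast_capacity: int):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()   # models GPU/HBM-resident KV blocks
        self.storage = {}           # models the storage-tier spill area

    def put(self, block_id: str, kv_block: bytes) -> None:
        self.fast[block_id] = kv_block
        self.fast.move_to_end(block_id)
        while len(self.fast) > self.fast_capacity:
            cold_id, cold_block = self.fast.popitem(last=False)
            self.storage[cold_id] = cold_block   # spill the LRU block

    def get(self, block_id: str) -> bytes:
        if block_id in self.fast:
            self.fast.move_to_end(block_id)      # refresh recency
            return self.fast[block_id]
        kv_block = self.storage.pop(block_id)    # promote from storage
        self.put(block_id, kv_block)
        return kv_block

cache = TieredKVCache(fast_capacity=2)
cache.put("seq1/layer0", b"kv0")
cache.put("seq1/layer1", b"kv1")
cache.put("seq1/layer2", b"kv2")            # evicts seq1/layer0 to storage
assert cache.get("seq1/layer0") == b"kv0"   # promoted back to the fast tier
```

The point of the tiering is that a long-context agent's KV cache can far exceed GPU memory, so only the working set needs to live in the fast tier.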
The complete POD scales to 40 racks with 1.2 quadrillion transistors, nearly 20,000 NVIDIA dies, 1,152 Rubin GPUs, 60 exaflops of compute, and 10 PB/s total bandwidth.
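As a rough sanity check, the per-rack and per-GPU figures implied by these totals can be derived in a few lines. The 16-compute-rack split and the per-GPU FLOPS number below are inferences from the stated totals, not figures from the announcement:

```python
# Back-of-envelope check of the POD's aggregate numbers.
# Inputs are the stated totals; derived values are inferences.

TOTAL_RACKS = 40
RUBIN_GPUS = 1_152
GPUS_PER_NVL72_RACK = 72          # one NVL72 rack holds 72 Rubin GPUs
TOTAL_COMPUTE_EXAFLOPS = 60       # stated POD-wide compute

# How many NVL72 compute racks supply the 1,152 GPUs?
compute_racks = RUBIN_GPUS // GPUS_PER_NVL72_RACK   # 16
other_racks = TOTAL_RACKS - compute_racks           # 24 (LPX, CPU, STX, SPX racks)

# Implied per-GPU throughput if the 60 EF were spread evenly.
per_gpu_petaflops = TOTAL_COMPUTE_EXAFLOPS * 1_000 / RUBIN_GPUS

print(f"NVL72 compute racks: {compute_racks}")    # 16
print(f"Remaining racks:     {other_racks}")      # 24
print(f"Implied per-GPU:     ~{per_gpu_petaflops:.0f} PFLOPS")  # ~52
```

So roughly 16 of the 40 racks are NVL72 compute racks, with the remainder devoted to inference, CPU, storage, and networking systems.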
Design & Deployment Features
Built on the third-generation NVIDIA MGX rack architecture, the platform features:
- Modular cable-free design for simplified deployment and serviceability
- Dynamic power steering and intelligent power smoothing for energy optimization
- Rack-level energy storage for improved reliability
- 45°C liquid cooling for efficient thermal management
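NVIDIA has not published the power-smoothing algorithm, but the core idea behind smoothing plus rack-level energy storage can be sketched as a slew-rate limiter: the facility-side draw ramps gradually while the energy store covers the gap to instantaneous demand. The function, parameters, and numbers below are illustrative assumptions only:

```python
def smooth_power(demand_kw, max_step_kw):
    """Slew-rate limiter as a toy model of power smoothing.

    Facility draw may change by at most max_step_kw per interval; a
    rack-level energy store (not modeled here) would cover the gap
    between smoothed draw and instantaneous demand.
    """
    draw = demand_kw[0]
    smoothed = [draw]
    for d in demand_kw[1:]:
        step = max(-max_step_kw, min(max_step_kw, d - draw))
        draw += step                 # ramp toward demand, capped per interval
        smoothed.append(draw)
    return smoothed

# A synchronized training burst from 40 kW to 100 kW ramps at <= 10 kW/step.
print(smooth_power([40, 100, 100, 100, 100, 40], max_step_kw=10))
# → [40, 50, 60, 70, 80, 70]
```

Smoothing like this matters at POD scale because thousands of GPUs starting or finishing a synchronized training step would otherwise present the grid with near-instantaneous multi-megawatt swings.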
NVIDIA's open MGX standard is supported by an ecosystem of over 80 partners with global supply chain expertise, enabling fast deployments and seamless transitions between different rack types.
Developer Impact
The Vera Rubin POD is optimized for modern agentic AI paradigms including mixture-of-experts routing, reinforcement learning, and large context memory requirements. Organizations deploying agentic systems can leverage this purpose-built infrastructure to handle the demanding requirements of multi-step AI agent workflows, tool invocation, and continuous reasoning workloads.
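Of the paradigms listed above, mixture-of-experts routing is the most mechanical: each token's gate scores select a small subset of experts, so only a fraction of the model's parameters run per token. The sketch below shows generic top-k routing in plain Python; it is not NVIDIA-specific code, and all names are illustrative:

```python
import math

def route_tokens(gate_logits, k=2):
    """Generic top-k mixture-of-experts routing (illustrative sketch).

    For each token, softmax the gate logits over experts, keep the k
    highest-scoring experts, and renormalize their weights to sum to 1.
    """
    assignments = []
    for logits in gate_logits:                     # one row of logits per token
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]   # numerically stable softmax
        total = sum(exps)
        probs = [e / total for e in exps]
        top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        norm = sum(probs[i] for i in top)
        assignments.append([(i, probs[i] / norm) for i in top])
    return assignments

# Two tokens routed over four experts; each token keeps its top-2 experts.
routes = route_tokens([[2.0, 0.5, 1.0, -1.0],
                       [0.0, 3.0, 0.1, 2.9]], k=2)
```

Because different tokens route to different experts, MoE models stress exactly the all-to-all interconnect bandwidth that NVLink- and Spectrum-class fabrics are built to provide.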