NVIDIA unveils Vera Rubin POD; integrates five rack systems with 60 exaflops for agentic AI
· release · platform · performance · developer.nvidia.com

Overview

NVIDIA Vera Rubin POD represents a comprehensive rethinking of data center infrastructure for the emerging era of agentic AI, in which AI agents interact with other AI agents, generating massive volumes of reasoning tokens and KV cache. Built on the third-generation NVIDIA MGX rack architecture, this POD-scale platform integrates five purpose-built rack-scale systems designed to work cohesively as a single AI supercomputer.
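The KV-cache pressure described above can be made concrete with a back-of-the-envelope estimate. The sketch below uses the standard KV-cache sizing formula for a transformer decoder; the model dimensions (layers, heads, head size) are illustrative assumptions, not Rubin or any specific model's specifications.

```python
# Rough KV-cache size estimate for a transformer decoder.
# The model dimensions used in the example are illustrative assumptions.

def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, dtype_bytes: int = 2) -> int:
    """Bytes of KV cache for one sequence: a K and a V tensor per layer."""
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# Hypothetical 70B-class model with grouped-query attention, FP16 cache
per_token = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, seq_len=1)
print(f"KV cache per token: {per_token / 1024:.0f} KiB")  # 320 KiB

long_ctx = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                          seq_len=1_000_000)
print(f"1M-token context: {long_ctx / 2**30:.0f} GiB")    # ~305 GiB
```

Even under these modest assumptions, a single million-token agentic session consumes hundreds of gigabytes of cache, which is the workload the dedicated storage and networking racks below are aimed at.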

Core Specifications

The Vera Rubin POD delivers impressive scale and performance metrics:

  • Compute: 1,152 NVIDIA Rubin GPUs across 40 racks
  • Performance: 60 exaflops total, with 10x better inference performance per watt vs. Blackwell
  • Bandwidth: 10 PB/s scale-up bandwidth
  • Scale: 1.2 quadrillion transistors and ~20,000 NVIDIA dies
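The headline figures above imply some useful per-unit numbers, obtained here by simple division; note that the 40 racks include non-GPU racks, so GPUs per rack is only an average.

```python
# Derived per-unit figures from the published POD-level specs.
# Simple division over headline numbers; averages only.

GPUS = 1_152            # Rubin GPUs in the POD
RACKS = 40              # total racks (including non-GPU racks)
TOTAL_EXAFLOPS = 60     # POD-level performance
SCALE_UP_PB_S = 10      # aggregate scale-up bandwidth

print(f"{TOTAL_EXAFLOPS * 1000 / GPUS:.1f} PF per GPU")            # 52.1
print(f"{GPUS / RACKS:.1f} GPUs per rack (average)")               # 28.8
print(f"{SCALE_UP_PB_S * 1000 / GPUS:.2f} TB/s scale-up per GPU")  # 8.68
```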

Five Specialized Rack Systems

The platform comprises five distinct rack-scale systems, each optimized for specific workload demands:

  1. NVL72: Core compute engine with 72 Rubin GPUs and 36 Vera CPUs, supporting pretraining, post-training, test-time scaling, and mixture-of-experts (MoE) routing
  2. Groq 3 LPX: Specialized inference rack with 256 LPUs per rack for low-latency processing
  3. Vera CPU: Dense CPU rack with 256 CPUs per rack for reinforcement learning and sandboxed environments
  4. BlueField-4 STX: AI-native storage system with CMX cache for handling large KV cache requirements
  5. Spectrum-6 SPX: Silicon photonics-based networking for low-latency, resilient connectivity

Key Features and Infrastructure

The MGX architecture introduces several innovations for improved reliability and efficiency:

  • Modular cable-free design for simplified deployment and maintenance
  • Dynamic power steering and intelligent power smoothing
  • Rack-level energy storage for improved resilience
  • 45°C liquid cooling to maximize energy efficiency

The platform is built on an open MGX standard with an ecosystem of 80+ global partners, enabling faster deployments and seamless integration across rack types.

Developer Impact

Organizations deploying the Vera Rubin POD can expect significant improvements in:

  • Token efficiency: Up to 1/10th the token cost for inference compared to Blackwell
  • Scalability: Support for modern agentic AI paradigms including MoE, reinforcement learning, and large context windows
  • Energy efficiency: Purpose-built systems eliminate wasted compute cycles across heterogeneous workloads
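The token-efficiency claim can be translated into a rough cost model. The sketch below applies the stated 1/10th-cost ratio; the baseline price and workload volume are made-up assumptions for illustration only.

```python
# Illustrative token-economics comparison based on the "up to 1/10th the
# token cost vs. Blackwell" claim. Baseline price and monthly volume are
# hypothetical assumptions, not published figures.

baseline_cost_per_m_tokens = 10.00                      # USD, assumed
rubin_cost_per_m_tokens = baseline_cost_per_m_tokens / 10

monthly_tokens_m = 50_000  # 50B tokens/month, assumed agentic workload
savings = (baseline_cost_per_m_tokens
           - rubin_cost_per_m_tokens) * monthly_tokens_m
print(f"Monthly savings: ${savings:,.0f}")  # $450,000
```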