NVIDIA unveils Vera Rubin POD, 1,152-GPU supercomputer with 60 exaflops for agentic AI workloads
Tags: release, platform, performance · Source: developer.nvidia.com

NVIDIA Vera Rubin POD Overview

NVIDIA introduced the Vera Rubin POD, a rack-scale supercomputer platform purpose-built for agentic AI systems. The platform is the third generation of the NVIDIA MGX rack architecture and is designed for the distinctive demands of modern AI agents: token-intensive reasoning, large KV caches, multi-step workflow orchestration, and CPU-based sandboxing environments.
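Why do agentic workloads inflate KV caches? Every generated token appends one key and one value vector per layer, so cache memory grows linearly with context length, and long multi-step agent sessions accumulate very long contexts. A minimal sketch (toy code with hypothetical model dimensions, not the specs of any Rubin-class model):

```python
# Toy illustration of KV-cache growth. Model dimensions below are
# hypothetical, chosen only to show the scale of the problem.

def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes needed to cache keys+values for one sequence (fp16 by default)."""
    per_token = n_layers * n_kv_heads * head_dim * 2  # 2 = one key + one value
    return context_len * per_token * bytes_per_elem

# Hypothetical large model: 61 layers, 8 grouped KV heads of dimension 128.
one_turn = kv_cache_bytes(8_000, 61, 8, 128)
long_agent_session = kv_cache_bytes(1_000_000, 61, 8, 128)
print(f"8k-token turn:    {one_turn / 1e9:.1f} GB")
print(f"1M-token session: {long_agent_session / 1e9:.1f} GB")
```

Under these assumed dimensions a single long agent session needs hundreds of gigabytes of cache, which is why the POD dedicates a storage tier (BlueField-4 STX, below) to KV cache management.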

Five Specialized Rack-Scale Systems

The Vera Rubin POD consists of five distinct, interconnected rack-scale systems:

  • NVL72: The core compute engine with 72 Rubin GPUs and 36 Vera CPUs, designed for pretraining, post-training, test-time scaling, and agentic scaling. Delivers up to 4x better training performance and 10x better inference performance per watt compared to Blackwell.
  • Groq 3 LPX: Features 256 LPUs per rack for extreme low-latency inference workloads.
  • Vera CPU: Provides 256 CPUs per rack for large-scale reinforcement learning and sandboxed environments.
  • BlueField-4 STX: AI-native storage system with CMX technology for KV cache management.
  • Spectrum-6 SPX: Silicon photonics-based networking for low-latency, resilient connectivity.

Performance and Scale Metrics

The complete Vera Rubin POD delivers:

  • 1,152 Rubin GPUs across 40 racks with 1.2 quadrillion transistors
  • 60 exaflops of compute performance
  • 10 PB/s total scale-up bandwidth
  • 45°C liquid cooling for energy efficiency
  • Support for mixture-of-experts routing and heavy compute-bound inference phases
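The headline figures imply per-GPU averages that are easy to sanity-check. A quick back-of-envelope pass using only the numbers above (the derived per-GPU values are averages, not official per-chip specifications):

```python
# Back-of-envelope check on the published POD figures.
TOTAL_GPUS = 1_152
TOTAL_RACKS = 40
TOTAL_FLOPS = 60e18          # 60 exaflops (precision unstated in the summary)
TOTAL_TRANSISTORS = 1.2e15   # 1.2 quadrillion, spanning all chip types,
                             # so the per-GPU figure below is an upper bound
SCALE_UP_BW = 10e15          # 10 PB/s aggregate scale-up bandwidth

print(f"GPUs per rack (average): {TOTAL_GPUS / TOTAL_RACKS:.1f}")
print(f"Compute per GPU:         {TOTAL_FLOPS / TOTAL_GPUS / 1e15:.1f} PFLOPS")
print(f"Transistors per GPU:     {TOTAL_TRANSISTORS / TOTAL_GPUS / 1e12:.2f} trillion (upper bound)")
print(f"Scale-up BW per GPU:     {SCALE_UP_BW / TOTAL_GPUS / 1e12:.1f} TB/s")
```

The averages come out to roughly 52 PFLOPS and about 8.7 TB/s of scale-up bandwidth per GPU, which is internally consistent with a dense rack-scale interconnect.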

Architecture Innovations

The platform leverages extreme co-design of seven chip types spanning compute, networking, and storage. Key MGX rack features include modular cable-free design, dynamic power steering, rack-level energy storage, intelligent power smoothing, and innovative cooling solutions. All racks share consistent power, cooling, and mechanical envelopes for seamless integration.
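"Power smoothing" addresses the large swings that synchronized training steps induce on the facility feed: thousands of GPUs ramp up and down together, so the load oscillates sharply. A toy model of the idea (an illustrative buffer scheme, not NVIDIA's actual mechanism): cap draw from the grid and let rack-level energy storage absorb the difference.

```python
# Toy model of rack-level power smoothing: the grid supplies at most a fixed
# cap; an energy-storage buffer discharges on demand spikes and recharges
# during dips. All values are illustrative, not MGX specifications.

def smooth(demand_kw, grid_cap_kw, storage_kwh, dt_h=1 / 3600):
    """Return the grid-draw trace (kW) for a per-second demand trace (kW)."""
    level = storage_kwh / 2  # start half-charged
    grid = []
    for d in demand_kw:
        if d > grid_cap_kw:
            # Spike: cap the grid draw, discharge storage for the remainder.
            discharge = min(d - grid_cap_kw, level / dt_h)
            level -= discharge * dt_h
            grid.append(grid_cap_kw)
        else:
            # Dip: use spare grid headroom to recharge storage.
            charge = min(grid_cap_kw - d, (storage_kwh - level) / dt_h)
            level += charge * dt_h
            grid.append(d + charge)
        # If storage ever empties, the load would have to throttle; omitted here.
    return grid

# Alternating train-step load: 1.8 MW bursts, 0.6 MW idle, 1.2 MW grid cap.
trace = smooth([1800, 600] * 4, grid_cap_kw=1200, storage_kwh=5)
print(max(trace))  # → 1200: the grid never sees more than the cap
```

With enough buffer capacity the grid sees a flat draw even though the racks oscillate, which is the point of pairing energy storage with intelligent power smoothing.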

Ecosystem and Deployment

NVIDIA's open MGX standard is backed by more than 80 supplier partnerships, enabling global supply chain support and accelerating deployment timelines. The platform is optimized for modern agentic AI paradigms including mixture-of-experts models, reinforcement learning, and large context memory workloads.
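The mixture-of-experts routing mentioned above is what makes this class of model routing- and bandwidth-sensitive at rack scale: a gate scores every token against every expert, but only the top-k experts actually run, so tokens must be shuttled to whichever GPUs host their chosen experts. A minimal generic top-k gating sketch (the standard technique, not NVIDIA's router):

```python
import numpy as np

# Minimal top-k mixture-of-experts routing sketch. Dimensions are toy values.
rng = np.random.default_rng(0)

n_tokens, d_model, n_experts, k = 4, 8, 16, 2
x = rng.standard_normal((n_tokens, d_model))        # token activations
w_gate = rng.standard_normal((d_model, n_experts))  # learned gate weights

logits = x @ w_gate
top_k = np.argsort(logits, axis=1)[:, -k:]              # chosen expert ids
top_logits = np.take_along_axis(logits, top_k, axis=1)
weights = np.exp(top_logits)
weights /= weights.sum(axis=1, keepdims=True)           # softmax over top-k

for t in range(n_tokens):
    print(f"token {t}: experts {top_k[t].tolist()}, "
          f"weights {np.round(weights[t], 2).tolist()}")
```

Because each token activates only k of the experts, total parameters can grow far beyond per-token compute, at the cost of the all-to-all traffic that the POD's 10 PB/s scale-up fabric is sized to absorb.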