NVIDIA Vera Rubin POD Overview
NVIDIA introduced the Vera Rubin POD, an AI supercomputer platform engineered for the agentic AI era. Built through extreme co-design of seven distinct chip types spanning compute, networking, and storage, the platform is designed to deliver generational gains in scale and per-watt efficiency for modern AI workloads.
Platform Architecture & Specifications
The Vera Rubin POD comprises five specialized rack-scale systems:
- NVL72: Core compute engine with 72 Rubin GPUs and 36 Vera CPUs, connected via NVLink. Optimized for pretraining, post-training, test-time scaling, and agentic scaling, with up to 4x better training performance and 10x better inference efficiency per watt compared to Blackwell.
- Groq 3 LPX: Delivers 256 LPUs per rack for extreme low-latency inference, ideal for agent interaction loops.
- Vera CPU: Provides 256 CPUs per rack for large-scale reinforcement learning and for sandboxed environments that validate AI-generated outputs.
- BlueField-4 STX: AI-native storage system with CMX technology for KV cache management and massive context memory.
- Spectrum-6 SPX: Silicon photonics-based networking for low-latency, resilient connectivity across the supercomputer.
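The overview does not detail how CMX manages KV cache, so the sketch below only illustrates the general pattern such a storage tier enables: keeping hot attention key/value blocks in fast (GPU-resident) memory and spilling cold ones to a larger storage tier. The class, method names, and sizes are hypothetical, not NVIDIA APIs.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier KV cache: a small 'fast' tier backed by a 'storage' tier.

    Recently used KV blocks stay in the fast tier; least-recently-used
    blocks spill to storage and are promoted back on access. All names
    and capacities here are illustrative only.
    """

    def __init__(self, fast_capacity: int):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()   # models GPU/HBM-resident KV blocks
        self.storage = {}           # models the storage-tier spill area

    def put(self, block_id: str, kv_block: bytes) -> None:
        self.fast[block_id] = kv_block
        self.fast.move_to_end(block_id)
        while len(self.fast) > self.fast_capacity:
            cold_id, cold_block = self.fast.popitem(last=False)
            self.storage[cold_id] = cold_block   # spill the LRU block

    def get(self, block_id: str) -> bytes:
        if block_id in self.fast:
            self.fast.move_to_end(block_id)      # refresh recency
            return self.fast[block_id]
        kv_block = self.storage.pop(block_id)    # promote from storage
        self.put(block_id, kv_block)
        return kv_block

cache = TieredKVCache(fast_capacity=2)
cache.put("seq1/layer0", b"kv0")
cache.put("seq1/layer1", b"kv1")
cache.put("seq1/layer2", b"kv2")            # evicts seq1/layer0 to storage
assert cache.get("seq1/layer0") == b"kv0"   # promoted back to the fast tier
```

The point of the tiering is that a long-context agent's KV cache can far exceed GPU memory, so only the working set needs to live in the fast tier.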
The complete POD scales to 40 racks with 1.2 quadrillion transistors, nearly 20,000 NVIDIA dies, 1,152 Rubin GPUs, 60 exaflops of compute, and 10 PB/s total bandwidth.
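As a rough sanity check, the per-rack and per-GPU figures implied by these totals can be derived in a few lines. The 16-compute-rack split and the per-GPU FLOPS number below are inferences from the stated totals, not figures from the announcement:

```python
# Back-of-envelope check of the POD's aggregate numbers.
# Inputs are the stated totals; derived values are inferences.

TOTAL_RACKS = 40
RUBIN_GPUS = 1_152
GPUS_PER_NVL72_RACK = 72          # one NVL72 rack holds 72 Rubin GPUs
TOTAL_COMPUTE_EXAFLOPS = 60       # stated POD-wide compute

# How many NVL72 compute racks supply the 1,152 GPUs?
compute_racks = RUBIN_GPUS // GPUS_PER_NVL72_RACK   # 16
other_racks = TOTAL_RACKS - compute_racks           # 24 (LPX, CPU, STX, SPX racks)

# Implied per-GPU throughput if the 60 EF were spread evenly.
per_gpu_petaflops = TOTAL_COMPUTE_EXAFLOPS * 1_000 / RUBIN_GPUS

print(f"NVL72 compute racks: {compute_racks}")    # 16
print(f"Remaining racks:     {other_racks}")      # 24
print(f"Implied per-GPU:     ~{per_gpu_petaflops:.0f} PFLOPS")  # ~52
```

So roughly 16 of the 40 racks are NVL72 compute racks, with the remainder devoted to inference, CPU, storage, and networking systems.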
Design & Deployment Features
Built on the third-generation NVIDIA MGX rack architecture, the platform features:
- Modular cable-free design for simplified deployment and serviceability
- Dynamic power steering and intelligent power smoothing for energy optimization
- Rack-level energy storage for improved reliability
- 45°C liquid cooling for efficient thermal management
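NVIDIA has not published the power-smoothing algorithm, but the core idea behind smoothing plus rack-level energy storage can be sketched as a slew-rate limiter: the facility-side draw ramps gradually while the energy store covers the gap to instantaneous demand. The function, parameters, and numbers below are illustrative assumptions only:

```python
def smooth_power(demand_kw, max_step_kw):
    """Slew-rate limiter as a toy model of power smoothing.

    Facility draw may change by at most max_step_kw per interval; a
    rack-level energy store (not modeled here) would cover the gap
    between smoothed draw and instantaneous demand.
    """
    draw = demand_kw[0]
    smoothed = [draw]
    for d in demand_kw[1:]:
        step = max(-max_step_kw, min(max_step_kw, d - draw))
        draw += step                 # ramp toward demand, capped per interval
        smoothed.append(draw)
    return smoothed

# A synchronized training burst from 40 kW to 100 kW ramps at <= 10 kW/step.
print(smooth_power([40, 100, 100, 100, 100, 40], max_step_kw=10))
# → [40, 50, 60, 70, 80, 70]
```

Smoothing like this matters at POD scale because thousands of GPUs starting or finishing a synchronized training step would otherwise present the grid with near-instantaneous multi-megawatt swings.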
NVIDIA's open MGX standard is supported by an ecosystem of over 80 partners with global supply chain expertise, enabling fast deployments and seamless transitions between different rack types.
Developer Impact
The Vera Rubin POD is optimized for modern agentic AI paradigms including mixture-of-experts routing, reinforcement learning, and large context memory requirements. Organizations deploying agentic systems can leverage this purpose-built infrastructure to handle the demanding requirements of multi-step AI agent workflows, tool invocation, and continuous reasoning workloads.
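Of the paradigms listed above, mixture-of-experts routing is the most mechanical: each token's gate scores select a small subset of experts, so only a fraction of the model's parameters run per token. The sketch below shows generic top-k routing in plain Python; it is not NVIDIA-specific code, and all names are illustrative:

```python
import math

def route_tokens(gate_logits, k=2):
    """Generic top-k mixture-of-experts routing (illustrative sketch).

    For each token, softmax the gate logits over experts, keep the k
    highest-scoring experts, and renormalize their weights to sum to 1.
    """
    assignments = []
    for logits in gate_logits:                     # one row of logits per token
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]   # numerically stable softmax
        total = sum(exps)
        probs = [e / total for e in exps]
        top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        norm = sum(probs[i] for i in top)
        assignments.append([(i, probs[i] / norm) for i in top])
    return assignments

# Two tokens routed over four experts; each token keeps its top-2 experts.
routes = route_tokens([[2.0, 0.5, 1.0, -1.0],
                       [0.0, 3.0, 0.1, 2.9]], k=2)
```

Because different tokens route to different experts, MoE models stress exactly the all-to-all interconnect bandwidth that NVLink- and Spectrum-class fabrics are built to provide.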