NVIDIA Vera Rubin POD combines seven specialized chips for agentic AI, delivering 60 exaflops across five rack systems

NVIDIA Vera Rubin POD: The Agentic AI Supercomputer

NVIDIA has introduced the Vera Rubin POD, a purpose-built, rack-scale AI supercomputer designed to address the computational demands of agentic AI systems. This platform represents a major architectural shift from traditional large language model training infrastructure toward systems optimized for multi-agent reasoning, tool invocation, and continuous workflows.

Key Specifications and Architecture

The Vera Rubin POD is a massive integrated system featuring:

  • 40 racks with 1.2 quadrillion transistors and nearly 20,000 NVIDIA dies
  • 1,152 NVIDIA Rubin GPUs delivering 60 exaflops of total compute
  • 10 PB/s aggregate bandwidth for interconnect and storage
  • Seven specialized chips across compute, networking, and storage domains
  • Built on the third-generation NVIDIA MGX rack architecture
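A quick back-of-the-envelope check shows how these headline figures relate (illustrative only; it assumes the quoted numbers all describe a single POD and that the 72-GPU NVL72 racks account for all 1,152 GPUs):

```python
# Sanity-check arithmetic on the published Vera Rubin POD figures.
TOTAL_GPUS = 1_152          # Rubin GPUs in the POD
GPUS_PER_NVL72_RACK = 72    # GPUs per NVL72 compute rack
TOTAL_EXAFLOPS = 60         # aggregate compute

compute_racks = TOTAL_GPUS // GPUS_PER_NVL72_RACK
per_gpu_pflops = TOTAL_EXAFLOPS * 1_000 / TOTAL_GPUS  # 1 EF = 1,000 PF

print(compute_racks)             # 16 NVL72 racks out of the 40 total
print(round(per_gpu_pflops, 1))  # ~52.1 PF per GPU (low-precision compute)
```

So roughly 16 of the 40 racks are NVL72 compute racks, with the remainder devoted to the inference, CPU, storage, and networking systems described below.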

Five Specialized Rack-Scale Systems

The POD integrates five purpose-built systems working as a cohesive unit:

  1. NVL72: Core compute engine with 72 Rubin GPUs per rack, optimized for the "four scaling laws" (pretraining, post-training, test-time scaling, agentic scaling). NVIDIA claims 4x the training performance of Blackwell and 10x the inference performance per watt, including on mixture-of-experts workloads.

  2. Groq 3 LPX: Low-latency inference engine with 256 LPUs per rack, designed for real-time agent decision-making and tool invocation.

  3. Vera CPU: Dense CPU sandbox with 256 CPUs per rack for large-scale reinforcement learning, code execution environments, and agent validation.

  4. BlueField-4 STX: AI-native storage system with CMX technology for KV cache management and massive context memory support.

  5. Spectrum-6 SPX: Silicon photonics-based networking for low-latency, resilient connectivity between rack systems.
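The mixture-of-experts workloads mentioned above activate only a few expert sub-networks per token, which is why routing speed and interconnect bandwidth dominate efficiency. A schematic top-k router (purely illustrative, not NVIDIA's implementation; all dimensions are made up) looks like:

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k_route(hidden, router_weights, k=2):
    """Score each token against every expert and keep the top k."""
    logits = hidden @ router_weights            # [tokens, experts]
    top = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts
    # Softmax over the selected experts' logits to get mixing weights.
    sel = np.take_along_axis(logits, top, axis=-1)
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return top, w

hidden = rng.standard_normal((4, 16))           # 4 tokens, hidden dim 16
router = rng.standard_normal((16, 8))           # 8 experts
experts, weights = top_k_route(hidden, router)
print(experts.shape, weights.shape)             # (4, 2) (4, 2)
```

Because different tokens route to different experts, tokens must be shuffled between devices every layer, making the low-latency rack-to-rack fabric a first-class requirement rather than an optimization.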

Purpose-Built for Agentic AI

This platform addresses the emerging paradigm where AI agents interact with each other in continuous loops, requiring:

  • Massive KV cache management for long-context reasoning
  • Low-latency inference for real-time decision-making
  • Dense CPU resources for sandboxed execution and validation
  • Mixture-of-experts routing for specialized sub-models
  • Reinforcement learning infrastructure for agent refinement
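The KV-cache requirement in the list above can be made concrete with a standard sizing formula (the model dimensions below are assumptions for illustration, not a specific NVIDIA model):

```python
# Rough KV-cache sizing for long-context agents.
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    # 2x for the separate key and value tensors; FP16/BF16 = 2 bytes each.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens

# Hypothetical 70B-class model: 80 layers, 8 KV heads, head dim 128.
per_million_tokens = kv_cache_bytes(1_000_000, 80, 8, 128)
print(per_million_tokens / 2**30)  # ~305 GiB per agent at 1M-token context
```

At hundreds of gibibytes per long-running agent, multiplied across many concurrent agents, the cache no longer fits in GPU memory alone, which is the motivation for offloading it to a dedicated storage tier like the BlueField-4 STX system described above.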

Deployment and Ecosystem

The Vera Rubin POD leverages NVIDIA's open MGX standard with over 80 global partners in the supply chain. All rack systems share compatible power, cooling, and mechanical envelopes, enabling rapid deployment and integration. The platform targets energy-efficient data center operations for the next generation of AI workloads dominated by agent-to-agent interactions rather than human-to-AI interactions.