Vera CPU Targets AI Infrastructure Bottlenecks
The NVIDIA Vera CPU addresses a critical constraint in modern AI systems: CPU-bound serial tasks within agentic loops that limit throughput despite powerful GPUs. As reasoning models and reinforcement learning systems evolve, token demand is straining traditional CPU architectures. Vera is purpose-built to handle two key workload classes: reinforcement learning for model training (such as code generation tasks) and agentic inference that enables AI agents to interact with tools and sandboxes.
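The serial-CPU bottleneck described above can be made concrete with a simple Amdahl's-law sketch. This is an illustrative calculation, not NVIDIA code: it assumes a hypothetical agentic loop in which some fraction of each iteration (tool-call parsing, sandbox setup, environment stepping) is serial CPU work that does not benefit from GPU acceleration.

```python
def effective_speedup(serial_fraction: float, gpu_speedup: float) -> float:
    """End-to-end speedup when only the non-serial (GPU) portion of a
    loop iteration is accelerated (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / gpu_speedup)

# If 20% of each agentic-loop iteration is serial CPU work, even an
# effectively infinite GPU speedup caps the end-to-end gain at 5x,
# which is why CPU performance matters in these pipelines.
print(round(effective_speedup(0.2, 1e9), 2))  # -> 5.0
```

The numbers are hypothetical; the point is structural: once CPU-side serial work dominates the loop, faster GPUs stop helping, and only a faster CPU moves the throughput ceiling.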
Architecture and Performance
The Vera CPU features 88 custom Olympus cores with NVIDIA Spatial Multithreading (SMT) and a monolithic compute die connected to adjacent dielets via a high-bandwidth coherent interconnect. Key specifications include:
- 1.2TB/s memory bandwidth with LPDDR5X SOCAMM modules
- 14 GB/s per core uniform memory bandwidth
- Up to 50% faster sandbox performance versus competitive platforms
- 4x sandbox density and 2x performance per watt compared to x86-based racks
- Deterministic latency and sustained high IPC, which are critical for RL feedback loops
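A quick back-of-envelope check (illustrative only, not from NVIDIA's materials) shows the two bandwidth figures above are mutually consistent: dividing the 1.2 TB/s aggregate LPDDR5X bandwidth evenly across 88 cores lands close to the quoted ~14 GB/s-per-core uniform figure.

```python
# Sanity-check the quoted spec-sheet numbers against each other.
TOTAL_BANDWIDTH_GBS = 1200  # 1.2 TB/s expressed in GB/s
CORES = 88                  # custom Olympus cores

per_core = TOTAL_BANDWIDTH_GBS / CORES
print(f"{per_core:.1f} GB/s per core")  # -> 13.6 GB/s per core
```

The result (~13.6 GB/s) rounds to the quoted 14 GB/s, suggesting the per-core number is simply the aggregate bandwidth divided uniformly across cores rather than a separately measured figure.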
Deployment Options and Timeline
Vera comes in multiple platform configurations: tightly coupled Vera Rubin NVL72 racks, standalone liquid-cooled CPU racks, and flexible single- and dual-socket servers for AI factory deployments. The architecture is optimized both for direct attachment to accelerators and for standalone CPU workloads, maximizing infrastructure ROI. Commercial availability from major OEMs is expected in the second half of 2026.