High-Performance Architecture for AI Workloads
NVIDIA has unveiled the Vera CPU, a purpose-built processor designed to address performance bottlenecks in modern AI infrastructure. The chip features 88 custom Olympus cores with NVIDIA's Spatial Multithreading (SMT) technology and delivers 1.2 TB/s of memory bandwidth—targeting the CPU-bound serial steps that, per Amdahl's law, cap end-to-end speedup in agentic AI loops no matter how fast the attached accelerators are.
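The Amdahl's-law point can be made concrete with a short calculation: if even 10% of a loop runs serially on the CPU, a 100x accelerator speedup on the rest still yields less than a 10x overall gain. A minimal sketch (the function name and figures are illustrative, not from NVIDIA's materials):

```python
def amdahl_speedup(serial_fraction: float, parallel_speedup: float) -> float:
    """Overall speedup when only the parallelizable fraction is accelerated."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / parallel_speedup)

# A 10% serial (CPU-bound) fraction caps the whole pipeline below 10x,
# even with a 100x faster accelerator:
print(round(amdahl_speedup(0.10, 100.0), 2))  # → 9.17
```

This is why shaving time off the serial CPU stages, rather than only the accelerated ones, moves the end-to-end number.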
Key Performance Characteristics
The Vera CPU is optimized for two critical AI workload patterns:
- Reinforcement learning (RL) post-training: Models generate code on accelerators, then ship to CPU clusters for building, testing, and evaluation in feedback loops
- Agentic inference: AI agents execute tools like browsers, databases, and code interpreters in sandbox environments at scale
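The RL post-training pattern above alternates accelerator-side generation with CPU-side build/test/evaluate steps. A minimal, hedged sketch of that CPU-side evaluation stage (all names are illustrative; production pipelines use hardened sandboxed runners, not a bare subprocess):

```python
import pathlib
import subprocess
import sys
import tempfile

def evaluate_candidate(code: str) -> bool:
    """Run one model-generated program and report pass/fail.

    This is the serial, CPU-bound step in the feedback loop: the
    accelerator sits idle until evaluation returns a reward signal.
    """
    with tempfile.TemporaryDirectory() as workdir:
        src = pathlib.Path(workdir) / "candidate.py"
        src.write_text(code)
        result = subprocess.run(
            [sys.executable, str(src)],
            capture_output=True,
            timeout=30,  # bound runaway candidates
        )
        return result.returncode == 0

# Feedback loop in miniature: generation happens elsewhere; the CPU
# cluster turns execution outcomes into rewards.
reward = 1.0 if evaluate_candidate("print('hello')") else 0.0
```

Throughput in this stage is governed by how many such sandboxes the CPU side can run concurrently, which is where the density figures below come in.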
The architecture delivers:
- Up to 50% faster sandbox performance versus competitive platforms
- 1.2 TB/s memory bandwidth for consistent performance under load
- 14 GB/s per core uniform memory bandwidth via LPDDR5X SOCAMM modules
- 4x sandbox density and 2x performance-per-watt over x86-based racks
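The per-core and aggregate bandwidth figures above are mutually consistent, which a one-line check confirms (decimal units assumed):

```python
cores = 88            # Olympus core count
per_core_gbs = 14     # uniform GB/s per core via LPDDR5X SOCAMM
total_gbs = cores * per_core_gbs
print(total_gbs)      # → 1232 GB/s, i.e. roughly the quoted 1.2 TB/s
```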
Design and Deployment
Vera's design pairs a monolithic compute die with adjacent dielets, linked by a second-generation Scalable Coherency Fabric that enables deterministic latency and high instructions-per-cycle (IPC). The platform supports multiple deployment options:
- Tightly coupled Vera Rubin NVL72 racks with accelerators
- Standalone liquid-cooled CPU racks
- Flexible single and dual-socket server configurations
Availability: Systems from major OEMs are slated for commercial availability in the second half of 2026.