High-Performance Architecture for AI Workloads
NVIDIA has unveiled the Vera CPU, a purpose-built processor designed to address performance bottlenecks in modern AI infrastructure. The chip features 88 custom Olympus cores with NVIDIA's Spatial Multithreading (SMT) technology and delivers 1.2 TB/s of memory bandwidth—targeting the CPU-bound serial steps that, per Amdahl's law, cap end-to-end speedup in agentic AI loops no matter how fast the attached accelerators are.
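The Amdahl's-law point can be made concrete with a short calculation: if even 10% of a loop runs serially on the CPU, a 100x accelerator speedup on the rest still yields less than a 10x overall gain. A minimal sketch (the function name and figures are illustrative, not from NVIDIA's materials):

```python
def amdahl_speedup(serial_fraction: float, parallel_speedup: float) -> float:
    """Overall speedup when only the parallelizable fraction is accelerated."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / parallel_speedup)

# A 10% serial (CPU-bound) fraction caps the whole pipeline below 10x,
# even with a 100x faster accelerator:
print(round(amdahl_speedup(0.10, 100.0), 2))  # → 9.17
```

This is why shaving time off the serial CPU stages, rather than only the accelerated ones, moves the end-to-end number.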
Key Performance Characteristics
The Vera CPU is optimized for two critical AI workload patterns:
- Reinforcement learning (RL) post-training: Models generate code on accelerators, then ship to CPU clusters for building, testing, and evaluation in feedback loops
- Agentic inference: AI agents execute tools like browsers, databases, and code interpreters in sandbox environments at scale
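The RL post-training pattern above alternates accelerator-side generation with CPU-side build/test/evaluate steps. A minimal, hedged sketch of that CPU-side evaluation stage (all names are illustrative; production pipelines use hardened sandboxed runners, not a bare subprocess):

```python
import pathlib
import subprocess
import sys
import tempfile

def evaluate_candidate(code: str) -> bool:
    """Run one model-generated program and report pass/fail.

    This is the serial, CPU-bound step in the feedback loop: the
    accelerator sits idle until evaluation returns a reward signal.
    """
    with tempfile.TemporaryDirectory() as workdir:
        src = pathlib.Path(workdir) / "candidate.py"
        src.write_text(code)
        result = subprocess.run(
            [sys.executable, str(src)],
            capture_output=True,
            timeout=30,  # bound runaway candidates
        )
        return result.returncode == 0

# Feedback loop in miniature: generation happens elsewhere; the CPU
# cluster turns execution outcomes into rewards.
reward = 1.0 if evaluate_candidate("print('hello')") else 0.0
```

Throughput in this stage is governed by how many such sandboxes the CPU side can run concurrently, which is where the density figures below come in.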
The architecture delivers:
- Up to 50% faster sandbox performance versus competitive platforms
- 1.2 TB/s memory bandwidth for consistent performance under load
- 14 GB/s per core uniform memory bandwidth via LPDDR5X SOCAMM modules
- 4x sandbox density and 2x performance-per-watt over x86-based racks
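The per-core and aggregate bandwidth figures above are mutually consistent, which a one-line check confirms (decimal units assumed):

```python
cores = 88            # Olympus core count
per_core_gbs = 14     # uniform GB/s per core via LPDDR5X SOCAMM
total_gbs = cores * per_core_gbs
print(total_gbs)      # → 1232 GB/s, i.e. roughly the quoted 1.2 TB/s
```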
Design and Deployment
Vera's design pairs a monolithic compute die with adjacent dielets, linked by a second-generation Scalable Coherency Fabric that enables deterministic latency and high instructions-per-cycle (IPC). The platform supports multiple deployment options:
- Tightly coupled Vera Rubin NVL72 racks with accelerators
- Standalone liquid-cooled CPU racks
- Flexible single and dual-socket server configurations
Availability: Systems from major OEMs are slated for commercial availability in the second half of 2026.