Vera CPU Targets AI Infrastructure Bottlenecks
The NVIDIA Vera CPU addresses a critical constraint in modern AI systems: CPU-bound serial tasks within agentic loops that limit throughput despite powerful GPUs. As reasoning models and reinforcement learning systems evolve, token demand is straining traditional CPU architectures. Vera is purpose-built to handle two key workload classes: reinforcement learning for model training (such as code generation tasks) and agentic inference that enables AI agents to interact with tools and sandboxes.
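The serial-CPU bottleneck described above can be made concrete with a simple Amdahl's-law sketch. This is an illustrative calculation, not NVIDIA code: it assumes a hypothetical agentic loop in which some fraction of each iteration (tool-call parsing, sandbox setup, environment stepping) is serial CPU work that does not benefit from GPU acceleration.

```python
def effective_speedup(serial_fraction: float, gpu_speedup: float) -> float:
    """End-to-end speedup when only the non-serial (GPU) portion of a
    loop iteration is accelerated (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / gpu_speedup)

# If 20% of each agentic-loop iteration is serial CPU work, even an
# effectively infinite GPU speedup caps the end-to-end gain at 5x,
# which is why CPU performance matters in these pipelines.
print(round(effective_speedup(0.2, 1e9), 2))  # -> 5.0
```

The numbers are hypothetical; the point is structural: once CPU-side serial work dominates the loop, faster GPUs stop helping, and only a faster CPU moves the throughput ceiling.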
Architecture and Performance
The Vera CPU features 88 custom Olympus cores with NVIDIA Spatial Multithreading (SMT) and a monolithic compute die connected to adjacent dielets via a high-bandwidth coherent interconnect. Key specifications include:
- 1.2TB/s memory bandwidth with LPDDR5X SOCAMM modules
- 14 GB/s per core uniform memory bandwidth
- Up to 50% faster sandbox performance versus competitive platforms
- 4x sandbox density and 2x performance per watt compared to x86-based racks
- Deterministic latency and sustained high IPC, which are critical for RL feedback loops
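A quick back-of-envelope check (illustrative only, not from NVIDIA's materials) shows the two bandwidth figures above are mutually consistent: dividing the 1.2 TB/s aggregate LPDDR5X bandwidth evenly across 88 cores lands close to the quoted ~14 GB/s-per-core uniform figure.

```python
# Sanity-check the quoted spec-sheet numbers against each other.
TOTAL_BANDWIDTH_GBS = 1200  # 1.2 TB/s expressed in GB/s
CORES = 88                  # custom Olympus cores

per_core = TOTAL_BANDWIDTH_GBS / CORES
print(f"{per_core:.1f} GB/s per core")  # -> 13.6 GB/s per core
```

The result (~13.6 GB/s) rounds to the quoted 14 GB/s, suggesting the per-core number is simply the aggregate bandwidth divided uniformly across cores rather than a separately measured figure.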
Deployment Options and Timeline
Vera comes in multiple platform configurations: tightly coupled Vera Rubin NVL72 racks, standalone liquid-cooled CPU racks, and flexible single- and dual-socket servers for AI factory deployments. The architecture is optimized both for direct attachment to accelerators and for standalone CPU workloads, maximizing infrastructure ROI. Commercial availability from major OEMs is expected in the second half of 2026.