NVIDIA Vera CPU Optimized for AI Factory Workloads
NVIDIA has unveiled the Vera CPU, a custom processor purpose-built for modern AI infrastructure. Featuring 88 custom Olympus cores with NVIDIA Spatial Multithreading and a second-generation Scalable Coherency Fabric, Vera addresses the bottleneck created when GPU-accelerated workloads are constrained by CPU-bound serial tasks in agentic loops.
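The bottleneck described above can be illustrated with a back-of-the-envelope model. This is only a sketch under stated assumptions: the function name, the one-second step times, and the premise that the CPU and GPU steps run strictly one after the other are all hypothetical and are not taken from NVIDIA's material.

```python
def loop_time(gpu_seconds: float, cpu_seconds: float, cpu_speedup: float = 1.0) -> float:
    """Wall-clock time of one agentic loop iteration.

    Assumes (hypothetically) that the GPU step and the CPU-bound
    serial step do not overlap, so their times simply add.
    """
    return gpu_seconds + cpu_seconds / cpu_speedup


# Hypothetical iteration: 1 s of GPU inference, 1 s of CPU-side tool execution.
baseline = loop_time(gpu_seconds=1.0, cpu_seconds=1.0)

# Same iteration with the CPU step running 1.5x faster (illustrative value).
faster_cpu = loop_time(gpu_seconds=1.0, cpu_seconds=1.0, cpu_speedup=1.5)

print(f"baseline:   {baseline:.2f} s per iteration")
print(f"faster CPU: {faster_cpu:.2f} s per iteration")
```

In this toy model the GPU time is unchanged, yet the iteration shortens only as far as the CPU step allows, which is the sense in which serial CPU work caps the throughput of an accelerated loop.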
Key Performance Specifications
The Vera CPU delivers:
- Up to 50% faster agentic sandbox performance compared to x86-based competitive platforms
- 1.2 TB/s of total memory bandwidth, delivered uniformly at roughly 14 GB/s per core
- Extreme single-core performance on individual tasks, sustained under constant load
- 4x sandbox density and 2x performance per watt over x86-based racks for AI factory deployments
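The per-core bandwidth figure in the list above is consistent with an even split of the total across all 88 cores. A quick check, assuming 1 TB = 1000 GB (the constant and function names are illustrative):

```python
# Figures taken from the text: 1.2 TB/s total memory bandwidth, 88 cores.
TOTAL_BANDWIDTH_GB_PER_S = 1200.0  # 1.2 TB/s, assuming 1 TB = 1000 GB
CORE_COUNT = 88


def per_core_bandwidth(total_gb_per_s: float, cores: int) -> float:
    """Even share of total memory bandwidth per core, in GB/s."""
    return total_gb_per_s / cores


share = per_core_bandwidth(TOTAL_BANDWIDTH_GB_PER_S, CORE_COUNT)
print(f"{share:.1f} GB/s per core")  # prints "13.6 GB/s per core"
```

The result rounds to the 14 GB/s quoted per core.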
Architecture and Design Focus
The monolithic die design with adjacent dielets supports deterministic latency and efficient rack-scale deployment. Key architectural choices include LPDDR5X SOCAMM modules for consistent memory access and a NUMA-first topology that couples cores, caches, and memory controllers tightly. These design decisions enable the sustained execution throughput necessary for both reinforcement learning post-training loops and agentic inference workloads that demand thousands of concurrent sandbox environments.
Use Cases and Deployment
Vera targets two primary workload classes: RL post-training, which produces specialized models for domains such as coding and engineering, and agentic actions, in which AI agents use tools such as web browsers, databases, and code interpreters. Platform options include tightly coupled Vera Rubin NVL72 racks, liquid-cooled CPU-only racks, and flexible single- or dual-socket servers for varying deployment scenarios.
Commercial availability is expected from major OEMs in the second half of 2026.