Nemotron 3 Super: A Purpose-Built Model for Agentic AI
NVIDIA has open-sourced Nemotron 3 Super, a 120B-parameter model (12B active parameters) designed specifically for the operational challenges of multi-agent AI systems. The model tackles two key problems in agentic reasoning: the "thinking tax" of running expensive reasoning models for every sub-task, and "context explosion," where agents drift out of alignment over long task sequences as accumulated history fills the context window.
Key Architectural Innovations
The model introduces several cutting-edge techniques to balance efficiency and accuracy:
- Latent MoE: Compresses token representations before routing them to experts, enabling 4x more experts at the same inference cost
- Multi-Token Prediction (MTP): Predicts multiple future tokens in a single forward pass, reducing generation time for long sequences and enabling built-in speculative decoding
- Hybrid Mamba-Transformer Backbone: Combines Mamba layers for linear-time sequence processing with Transformer layers for precision reasoning, delivering 4x improved memory and compute efficiency
- Native NVFP4 Pretraining: Optimized for NVIDIA Blackwell, achieving 4x faster inference on B200 vs. FP8 on H100 while maintaining accuracy
- Multi-Environment Reinforcement Learning: Post-trained using NVIDIA NeMo tools across 21 environment configurations with 1.2+ million environment rollouts
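To make the latent-MoE bullet concrete, here is a minimal routing sketch: tokens are first projected into a small latent space, and the router scores experts over those cheap low-dimensional vectors rather than the full hidden states. Everything here (dimensions, matrix names, the `route` function) is an illustrative assumption, not NVIDIA's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent-MoE routing sketch (illustrative assumption, not the real model):
# compress each token before routing so the router works in a small space.
d_model, d_latent, n_experts, top_k = 512, 64, 16, 2

W_down = rng.normal(0, 0.02, (d_model, d_latent))    # compression projection
router = rng.normal(0, 0.02, (d_latent, n_experts))  # router over latents

def route(tokens: np.ndarray):
    """Return (expert indices, gate weights) per token via top-k routing."""
    latents = tokens @ W_down                      # (n, d_latent): compressed
    logits = latents @ router                      # (n, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of top-k experts
    gates = np.take_along_axis(logits, top, axis=-1)
    gates = np.exp(gates - gates.max(-1, keepdims=True))
    gates /= gates.sum(-1, keepdims=True)          # softmax over chosen experts
    return top, gates

tokens = rng.normal(size=(4, d_model))
experts, gates = route(tokens)
print(experts.shape, gates.shape)  # (4, 2) (4, 2)
```

Because routing cost scales with the router's input width, shrinking that width is what lets the expert count grow without raising per-token inference cost.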
Performance and Availability
The model achieves 85.6% on PinchBench—a new benchmark for evaluating LLMs as agentic brains—making it the best-performing open model in its class. It delivers 5x the throughput of the previous Nemotron Super while maintaining a native 1M-token context window for long-term agentic memory.
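Part of how MTP-style speculative decoding raises throughput is the draft-and-verify loop: cheap draft predictions propose several tokens, and the full model only confirms or corrects them. The toy sketch below uses hash-like stand-in functions for both models (all names and logic here are illustrative assumptions); real MTP heads share the backbone's hidden state, and verification happens in one batched forward pass rather than a loop.

```python
def target_next(prefix):
    # Stand-in for the slow, accurate model (illustrative only).
    return (sum(prefix) * 31 + 7) % 100

def draft_next(prefix):
    # Stand-in for the cheap draft head: usually agrees with the
    # target, but guesses poorly when the context length is a
    # multiple of 4 (an arbitrary way to inject disagreement).
    if len(prefix) % 4 == 0:
        return sum(prefix) % 100
    return target_next(prefix)

def speculative_step(prefix, k=4):
    """Draft k tokens, accept the longest prefix the target agrees
    with, then emit one guaranteed-correct token from the target."""
    drafts, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        drafts.append(t)
        ctx.append(t)
    accepted, ctx = [], list(prefix)
    for t in drafts:
        if target_next(ctx) == t:  # target would have said the same
            accepted.append(t)
            ctx.append(t)
        else:
            break                  # first disagreement: stop accepting
    accepted.append(target_next(ctx))  # correction from the target
    return accepted

out = speculative_step([1, 2, 3])
print(out)  # [93, 76]
```

The output is identical to what greedy decoding with the target alone would produce; the speedup comes from emitting several tokens per expensive verification step whenever the drafts are accepted.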
Nemotron 3 Super is fully open: weights, datasets, and training recipes are available on Hugging Face, enabling developers to customize, optimize, and deploy it on their own infrastructure. This positions it as a practical choice for autonomous agents in software development, cybersecurity triage, and other reasoning-heavy applications.