NVIDIA releases Nemotron 3 family of agentic AI models with multimodal reasoning and safety capabilities

Introducing the Nemotron 3 Family

NVIDIA has released a new generation of open-source Nemotron models designed to work together as a unified agentic AI stack. The family includes specialized models for different components of an agentic system: reasoning, safety moderation, voice interaction, and multimodal understanding. Together, the models target the main difficulty of building scalable, production-grade AI agents: handling real-world complexity.

Key Models and Capabilities

Nemotron 3 Super is an open hybrid mixture-of-experts (MoE) model optimized for long-context reasoning and multi-agent tasks. It activates only 12B parameters per forward pass while maintaining high accuracy, which addresses the "context explosion" problem that arises when multi-agent systems accumulate massive token histories. The model uses a hybrid Mamba-Transformer architecture, supports context windows of up to 1M tokens, and achieves up to 5x higher throughput than the previous generation.
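To illustrate why an MoE model can activate only a fraction of its parameters per forward pass, here is a minimal sketch of generic top-k expert routing. It is not Nemotron 3 Super's actual router, which the source does not describe; the function names, the toy scalar "experts," and the hand-picked router logits are all assumptions made for illustration.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and return (index, weight) pairs."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in ranked)
    exps = [math.exp(router_logits[i] - peak) for i in ranked]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(ranked, exps)]

def moe_forward(x, experts, router_logits, k=2):
    """Run only the k selected experts; the others are never evaluated."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Hypothetical toy experts; in a real model each is a feed-forward block,
# and the router logits come from a learned function of the input.
experts = [lambda x: x, lambda x: 2 * x, lambda x: x + 1, lambda x: 4 * x]
logits = [0.1, 2.0, -1.0, 2.0]

print(top_k_route(logits, 2))             # experts 1 and 3, weight 0.5 each
print(moe_forward(3.0, experts, logits))  # 0.5 * 6.0 + 0.5 * 12.0 = 9.0
```

With four experts and k=2, half of the expert parameters stay idle for this input. The same principle, at a much larger scale, is what lets an MoE model hold many parameters while activating only some of them per pass.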

Nemotron 3 Content Safety provides multimodal, multilingual content moderation for safety guardrails. Nemotron 3 VoiceChat (in early access) enables low-latency, natural, full-duplex voice interactions. Two more models are in development: Nemotron 3 Ultra, aimed at the highest reasoning accuracy among open frontier models, and Nemotron 3 Nano Omni, aimed at enterprise-grade multimodal understanding.

Performance and Technical Details

Nemotron 3 Super runs in NVFP4, a 4-bit floating-point precision, on NVIDIA Blackwell GPUs to improve efficiency. According to Artificial Analysis evaluations, Nemotron 3 Super NVFP4 ranks among the top open-weight models under 250B parameters, matching the intelligence scores of leading alternatives while leading in throughput per GPU. The model also supports configurable "thinking budgets" for chain-of-thought reasoning, which let developers cap the reasoning spent on a request and so keep latency and cost predictable.
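The idea behind a thinking budget can be sketched as a generation loop that counts reasoning tokens and forces the reasoning phase to close once the cap is reached. The source does not say how Nemotron 3 Super enforces or exposes its budget, so everything here is an assumption for illustration: the `END_THINK` and `EOS` markers, the `toy_model` stand-in, and the `generate_with_budget` function are hypothetical.

```python
END_THINK = "</think>"  # hypothetical end-of-reasoning marker
EOS = "<eos>"           # hypothetical end-of-sequence marker

def toy_model(context):
    """Stand-in for a model: wants 5 reasoning steps, then a one-token answer."""
    if END_THINK not in context:
        return "step" if context.count("step") < 5 else END_THINK
    return EOS if context[-1] == "answer" else "answer"

def generate_with_budget(next_token, prompt, thinking_budget, max_answer_tokens=16):
    """Return (reasoning_tokens_used, answer_tokens) under a reasoning cap."""
    context = list(prompt)
    used = 0
    # Phase 1: reasoning, limited to thinking_budget tokens.
    while used < thinking_budget:
        token = next_token(context)
        context.append(token)
        if token == END_THINK:
            break  # the model finished reasoning on its own
        used += 1
    else:
        # Budget exhausted: close the reasoning phase on the model's behalf.
        context.append(END_THINK)
    # Phase 2: the answer, generated from whatever reasoning fit in the budget.
    answer = []
    for _ in range(max_answer_tokens):
        token = next_token(context)
        if token == EOS:
            break
        context.append(token)
        answer.append(token)
    return used, answer

print(generate_with_budget(toy_model, ["question"], thinking_budget=3))
print(generate_with_budget(toy_model, ["question"], thinking_budget=10))
```

In the first call the cap cuts reasoning off after 3 tokens; in the second the toy model stops by itself after 5. Either way the reasoning cost has a known upper bound, which is what makes latency and cost predictable.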

Developer Tools and Deployment

NVIDIA provides open data, training recipes, and NeMo tools to support developers building agentic systems. The tooling includes the NeMo Evaluator for robust benchmarking and an Agent Toolkit for end-to-end optimization. The Nemotron RAG collection adds embedding and reranking models for multimodal retrieval-augmented generation across both image and text.
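Embedding and reranking models usually work as two stages: a cheap embedding search narrows the corpus to a few candidates, then a more careful reranker reorders them. The sketch below shows that pattern on text only, with toy stand-ins rather than the Nemotron RAG models or any NVIDIA API. The bag-of-words `embed`, the `overlap_score` reranker, and the sample documents are assumptions for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, docs, top_n):
    """Stage 1: keep the top_n documents by embedding similarity."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_n]

def overlap_score(query, doc):
    """Toy stand-in for a reranker: number of distinct query terms in the doc."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank(query, candidates, score=overlap_score):
    """Stage 2: reorder the shortlisted candidates with the reranker score."""
    return sorted(candidates, key=lambda d: score(query, d), reverse=True)

docs = [
    "hybrid architecture",
    "hybrid mamba transformer architecture raises throughput per gpu",
    "full duplex voice interaction",
]
query = "hybrid architecture throughput"

candidates = retrieve(query, docs, top_n=2)
print(candidates)                 # the short document ranks first on cosine
print(rerank(query, candidates))  # the reranker promotes the fuller match
```

The two stages disagree on purpose here: cosine similarity favors the short document, while the reranker prefers the one that covers all three query terms. A production pipeline would replace both toy functions with real models and, for multimodal retrieval, embed images as well as text.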
