Overview
NVIDIA has released Nemotron-Nano-9B-v2-Japanese, a Japanese-specialized variant of its efficient small language model. The model achieves top performance on the Nejumi Leaderboard 4 in the sub-10B parameter category, addressing a gap in the Japanese enterprise AI market, where few models combine advanced Japanese proficiency with agentic task-execution capabilities.
Key Features & Capabilities
The model retains the proven architecture of the English-optimized Nemotron-Nano-9B-v2 while adding specialized Japanese language capabilities. These include:
- Advanced Japanese language understanding with strong reasoning and knowledge retention
- Robust tool-calling and agent abilities for function invocation and multi-step workflows
- Instruction-following and alignment across multiple dimensions
- Efficient parameter count (9B) enabling on-premises deployment without significant infrastructure investment
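The tool-calling capability above generally follows the familiar function-calling pattern: the model is given a tool schema, emits a structured call, and the application executes it. The sketch below illustrates the round trip in plain Python; the tool name, schema, and simulated model output are illustrative assumptions, not taken from the model card:

```python
import json

# Illustrative OpenAI-style tool schema; the exact schema accepted by the
# model's chat template may differ -- consult the model card.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a Japanese city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call and invoke the matching function."""
    call = json.loads(tool_call_json)
    if call["name"] == "get_weather":
        # Stub standing in for a real weather API.
        return f'{call["arguments"]["city"]}: 18°C, cloudy'
    raise ValueError(f"unknown tool: {call['name']}")

# Simulated model output for the prompt 「東京の天気は？」 ("What's the weather in Tokyo?")
model_output = '{"name": "get_weather", "arguments": {"city": "Tokyo"}}'
print(dispatch(model_output))  # Tokyo: 18°C, cloudy
```

In a multi-step agent workflow, the tool result would be appended to the conversation and the model queried again until it produces a final answer.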
Training Approach
The development leverages two foundational pillars:
Nemotron-Nano-9B-v2 Architecture: A proven baseline with excellent size-to-performance ratio, adapted from the English-trained model.
Nemotron-Personas-Japan Dataset: An open-source (CC BY 4.0) dataset of 6M demographically and geographically representative Japanese personas, used as seed data for synthetic data generation. This culturally grounded approach helps ensure training data reflects real-world Japanese diversity and use cases, particularly for tool-calling scenarios.
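As a rough illustration of persona-seeded synthetic data generation, the sketch below turns one persona record into a generation prompt that a teacher model could answer. The field names (`occupation`, `prefecture`, `age`) and the template are assumptions for illustration, not the actual dataset schema or NVIDIA's pipeline:

```python
# Hypothetical persona record; the real Nemotron-Personas-Japan schema may differ.
persona = {
    "occupation": "pharmacist",
    "prefecture": "Osaka",
    "age": 42,
}

PROMPT_TEMPLATE = (
    "You are a {age}-year-old {occupation} living in {prefecture}, Japan. "
    "Write a realistic request, in Japanese, that would require calling an "
    "external tool (e.g. a database lookup or a calculator) to answer."
)

def seed_prompt(p: dict) -> str:
    """Render a generation prompt from a persona seed record."""
    return PROMPT_TEMPLATE.format(**p)

print(seed_prompt(persona))
```

Sampling prompts across millions of such personas is what gives the synthetic corpus its demographic and geographic spread.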
The training pipeline combines:
- Continued pretraining on Japanese OSS corpora (Wikipedia, fineweb-2-ja, aozorabunko, sip3-ja-general-web-corpus)
- Synthetic data generation (SDG) using Nemotron-Personas-Japan as seed set
- Supervised fine-tuning (SFT) with tool-calling and instruction-following data
- Tooling: Megatron-LM for training and NeMo Curator for data curation
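The SFT stage above consumes chat-formatted examples that interleave tool calls with conversation turns. A plausible shape for one JSONL training record is sketched below; the field layout is an assumption for illustration, since the announcement does not specify the exact format used by the NeMo/Megatron-LM pipeline:

```python
import json

# Illustrative tool-calling SFT record (field names are assumptions).
record = {
    "messages": [
        {"role": "user", "content": "大阪の明日の天気を教えて。"},  # "Tell me tomorrow's weather in Osaka."
        {
            "role": "assistant",
            "tool_calls": [
                {"name": "get_weather", "arguments": {"city": "Osaka", "day": "tomorrow"}}
            ],
        },
        {"role": "tool", "content": '{"temp_c": 15, "condition": "rain"}'},
        {"role": "assistant", "content": "明日の大阪は雨、気温は15℃の見込みです。"},
    ]
}

# Serialize one line of a JSONL training file, keeping Japanese text readable.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Records like this teach the model both when to emit a tool call and how to ground its final Japanese answer in the tool's result.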
Enterprise Relevance
The model addresses specific Japanese enterprise needs:
- On-premises deployment: Sub-10B size enables private network deployment for organizations handling sensitive data
- Customization efficiency: Strong out-of-the-box Japanese capabilities reduce the fine-tuning effort required compared to starting from a general-purpose foundation model
- Agent development: Proven agent architecture supports rapid prototyping of multi-agent systems without large-model overhead
Availability
The model is available on Hugging Face at nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese. Related datasets including Nemotron-Personas for other regions (US, India, Singapore, Brazil) are also available, enabling similar customization approaches across markets.
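A minimal loading sketch via the Hugging Face transformers library is shown below. The heavy download is deferred into a function so nothing is fetched at import time; generation parameters are illustrative, and a recent transformers release may be required for this architecture:

```python
MODEL_ID = "nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese"

def load_generator():
    """Build a text-generation pipeline for the model.

    Deferred import and load: the first call downloads the ~9B-parameter
    weights and needs substantial GPU/CPU memory.
    """
    from transformers import pipeline
    return pipeline("text-generation", model=MODEL_ID, device_map="auto")

# Example chat-style input once the pipeline is loaded:
messages = [{"role": "user", "content": "自己紹介をしてください。"}]  # "Please introduce yourself."
# generator = load_generator()
# print(generator(messages, max_new_tokens=128))
```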