Nemotron 3 Nano

Type: Model / NIM microservice Tags: NVIDIA, Nemotron, LLM, reasoning, agentic AI, MoE, Mamba, NIM, NeMo, Megatron Bridge, training recipes Related: Nemotron, Nemotron-Training-Recipes, Nemotron-3-Super, Nemotron-3-Nano-Omni, NVIDIA-NIM, NIM-for-Large-Language-Models, NVIDIA-NeMo, NeMo-Megatron-Bridge, NeMo-AutoModel, NeMo-RL, NeMo-Gym, NeMo-Run, NeMo-Data-Designer, NeMo-Evaluator, NVIDIA-Agent-Intelligence-Toolkit, TensorRT-LLM, vLLM, Megatron-LM, NVIDIA-AI-Q-Blueprint, NVIDIA-Data-Flywheel-Blueprint Sources: https://build.nvidia.com/nvidia/nemotron-3-nano-30b-a3b/modelcard; https://docs.nvidia.com/nemo/megatron-bridge/latest/models/llm/nemotron3.html; https://docs.nvidia.com/nemotron/latest/nemotron/nano3/README.html; https://docs.nvidia.com/nemotron/latest/nemotron/nano3/pretrain.html; https://docs.nvidia.com/nemotron/latest/nemotron/nano3/sft.html; https://docs.nvidia.com/nemotron/latest/nemotron/nano3/rl.html; https://docs.nvidia.com/nemo/gym/latest/index.html; https://docs.nvidia.com/nemotron/latest/use-case-examples/Simple%20Nemotron-3-Nano%20Usage%20Example/README.html; https://developer.nvidia.com/blog/introducing-nemotron-3-super-an-open-hybrid-mamba-transformer-moe-for-agentic-reasoning/; https://developer.nvidia.com/nemotron Last Updated: 2026-04-29

Summary

Nemotron 3 Nano is NVIDIA’s 30B-total, 3.5B-active text LLM in the Nemotron 3 family. Current NeMo-Megatron-Bridge docs describe it as a unified model for reasoning and non-reasoning tasks, with pretraining, full-parameter fine-tuning, LoRA, and Hugging Face/Megatron checkpoint conversion support.

Detail

Purpose

Nemotron 3 Nano is the efficient text-reasoning model in the Nemotron 3 family. NVIDIA positions it for targeted, high-frequency steps inside agentic workflows, while Nemotron-3-Super handles more complex planning/reasoning and Nemotron-3-Nano-Omni handles multimodal perception across image, video, audio, documents, and UI screens.

Model profile

  • Total parameters: 30B.
  • Active parameters: 3.5B.
  • Architecture: hybrid Mamba-Transformer mixture-of-experts model that interleaves Mamba-2/MoE layers with attention layers.
  • Current Megatron Bridge docs describe 23 Mamba-2/MoE layers and 6 attention layers, with 128 experts plus 1 shared expert per MoE layer and 5 experts activated per token.
  • Current Nemotron usage docs identify the model as nvidia/nemotron-3-nano-30b-a3b in OpenRouter examples.
  • The build.nvidia.com model page lists the model as an NVIDIA NIM model entry, while the Megatron Bridge page uses nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 for Hugging Face/Megatron workflows.
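The OpenRouter-style identifier above works with any OpenAI-compatible chat endpoint. As a minimal sketch, the helper below only assembles the request body; the endpoint, sampling defaults, and helper name are illustrative assumptions, not values from the model card:

```python
import json

# Model id from the Nemotron usage docs; everything else here is illustrative.
MODEL_ID = "nvidia/nemotron-3-nano-30b-a3b"


def build_chat_request(user_prompt: str,
                       system_prompt: str = "You are a helpful assistant.",
                       temperature: float = 0.6,
                       max_tokens: int = 1024) -> dict:
    """Return a JSON-serializable body for POST /v1/chat/completions.

    Hypothetical helper: temperature/max_tokens are example defaults,
    not recommendations from the Nemotron docs.
    """
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


if __name__ == "__main__":
    print(json.dumps(build_chat_request("Summarize MoE routing in two sentences."), indent=2))
```

The same payload can be sent to build.nvidia.com's OpenAI-compatible API or to OpenRouter; only the base URL and API key differ.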

Training and customization

Current NeMo-Megatron-Bridge docs cover Hugging Face to Megatron import/export, pretraining, full-parameter fine-tuning, and LoRA fine-tuning for Nemotron 3 Nano. They specify the custom container nvcr.io/nvidia/nemo:25.11.nemotron_3_nano and advise running from /opt/Megatron-Bridge.
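A hedged sketch of entering that container: only the image name and working directory come from the docs; the mount path and runtime flags below are illustrative assumptions to adapt to your cluster.

```shell
# Pull the Nemotron 3 Nano training container named in the Megatron Bridge docs.
docker pull nvcr.io/nvidia/nemo:25.11.nemotron_3_nano

# Start an interactive session. --gpus and -v are illustrative; a Slurm cluster
# would typically launch this via srun/pyxis instead.
docker run --rm -it --gpus all \
  -v "$PWD/checkpoints:/workspace/checkpoints" \
  nvcr.io/nvidia/nemo:25.11.nemotron_3_nano \
  bash -c "cd /opt/Megatron-Bridge && ls"  # docs advise running from this directory
```

From inside /opt/Megatron-Bridge you would then invoke the documented pretraining or fine-tuning recipes.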

The same docs provide training-scale guidance: pretraining examples use TP=4, EP=8, PP=1, CP=1 and recommend 4 H100 nodes for the shown pretraining configuration; full-parameter fine-tuning examples default to TP=1, EP=8, PP=1, CP=1 and require at least 2 H100 nodes in the documented recipe.
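To make those node counts concrete: under a common Megatron-style layout in which expert parallelism is folded into the data-parallel dimension, the smallest world size that realizes a configuration is TP × PP × CP × EP. This layout is an assumption for illustration, not a statement from the docs; the documented recipes may require more nodes than this arithmetic minimum for memory or throughput reasons, as the 2-node SFT floor suggests.

```python
def min_gpus(tp: int, ep: int, pp: int = 1, cp: int = 1) -> int:
    """Smallest world size that can realize the given parallelism degrees,
    assuming expert parallelism lives inside the data-parallel dimension
    (so DP must be at least EP). Layout assumption, not from the docs."""
    return tp * pp * cp * ep


def min_nodes(tp: int, ep: int, pp: int = 1, cp: int = 1, gpus_per_node: int = 8) -> int:
    """Ceiling of min_gpus over GPUs per node (8 for a typical H100 node)."""
    return -(-min_gpus(tp, ep, pp, cp) // gpus_per_node)


# Pretraining recipe: TP=4, EP=8, PP=1, CP=1 -> 32 GPUs = 4 H100 nodes,
# which matches the documented 4-node recommendation.
print(min_nodes(tp=4, ep=8))  # 4

# SFT recipe: TP=1, EP=8 -> 8 GPUs fit on 1 node arithmetically, yet the
# documented recipe requires >= 2 nodes, presumably for memory/throughput.
print(min_nodes(tp=1, ep=8))  # 1
```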

Nemotron-Training-Recipes adds the public cookbook view of the same model: Stage 0 pretraining on a 25T-token curriculum and long-context extension, Stage 1 SFT with role-based loss masking and packed chat data, Stage 2 RL with NeMo-RL GRPO/Ray/vLLM workflows, and evaluation/import paths. The recipe docs explicitly distinguish the open-source public recipe data from the full data used for released model benchmarks.

Agent workflows

The current Nemotron simple usage guide covers basic inference, reasoning mode toggles, LangChain/LangGraph memory, web-search agents, and a multi-agent coordinator pattern using Nemotron 3 Nano. This makes Nano important for users asking about efficient Nemotron agent construction rather than just model architecture.
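The reasoning toggle mentioned in the usage guide can be sketched as a message-list builder. The "/think" and "/no_think" system-prompt markers below follow the convention used by earlier Nemotron Nano releases and are an assumption here, not confirmed for Nemotron 3 Nano; check the usage guide for the exact mechanism.

```python
def build_messages(user_prompt: str, reasoning: bool) -> list[dict]:
    """Assemble chat messages with a reasoning-mode marker in the system turn.

    Assumption: "/think" / "/no_think" toggles, borrowed from earlier
    Nemotron Nano conventions; verify against the Nano 3 usage guide.
    """
    system = "/think" if reasoning else "/no_think"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]


# Reasoning ON: the model is expected to emit chain-of-thought before answering.
print(build_messages("Plan a 3-step web search.", reasoning=True)[0]["content"])
```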

Use this page for the text-only Nemotron 3 Nano reasoning model. Use Nemotron-3-Nano-Omni for the newer omnimodal Nano Omni model, Nemotron-3-Super for high-capacity agentic planning, and Nemotron for the family-level page.

Connections

Source Excerpts

  • “3.5B active parameters and 30B parameters in total”
  • “unified model for both reasoning and non-reasoning tasks”
  • “Reasoning Modes - Toggle chain-of-thought thinking ON/OFF”

Resources