Nemotron

Type: Model Tags: NVIDIA, Nemotron, LLM, multimodal, speech, OCR, content safety, agentic AI, NIM, training recipes Related: NVIDIA-NeMo, Nemotron-Training-Recipes, NeMo-AutoModel, NeMo-RL, NeMo-Megatron-Bridge, NeMo-Customizer, NeMo-Evaluator, NVIDIA-NIM, NIM-for-Large-Language-Models, NIM-for-Vision-Language-Models, Nemotron-3-Nano, Nemotron-3-Super, Nemotron-3-Ultra, Nemotron-3-Nano-Omni, Nemotron-Parse, NeMo-Retriever-Embedding-NIM, Llama-Nemotron-Embed-1B-v2, Llama-Nemotron-Rerank-1B-v2, Llama-Nemotron-Embed-VL-1B-v2, Llama-Nemotron-Rerank-VL-1B-v2, NVIDIA-Speech-NIM-Microservices, NVIDIA-ASR-NIM, Nemotron-ASR-Streaming, Nemotron-3-VoiceChat, NVIDIA-TTS-NIM, NVIDIA-NMT-NIM, NVIDIA-NemoGuard-NIMs, Nemotron-3-Content-Safety, Nemotron-Content-Safety-Reasoning-4B-Experimental-NIM, Llama-3.1-Nemotron-Safety-Guard-8B-NIM, Llama-3.1-NemoGuard-8B-ContentSafety-NIM, NVIDIA-AI-Blueprints, NVIDIA-AI-Q-Blueprint, NVIDIA-Data-Flywheel-Blueprint, NVIDIA-Agent-Intelligence-Toolkit, NeMo-Retriever, NeMo-Guardrails, TensorRT-LLM, NVIDIA-DGX-Cloud, NVIDIA-NemoClaw, NVIDIA-Enterprise-Reference-Architectures, Red-Hat-AI-Factory-with-NVIDIA Sources: https://build.nvidia.com/models, https://build.nvidia.com/blueprints, https://build.nvidia.com/nvidia/nemotron-3-super-120b-a12b/modelcard, https://build.nvidia.com/nvidia/nemotron-3-nano-30b-a3b/modelcard, https://build.nvidia.com/nvidia/nemotron-3-nano-omni-30b-a3b-reasoning, https://build.nvidia.com/nvidia/nemotron-3-content-safety/modelcard, https://build.nvidia.com/nvidia/nemotron-asr-streaming/modelcard, https://build.nvidia.com/nvidia/nemotron-voicechat/modelcard, https://docs.nvidia.com/nemotron/nightly/usage-cookbook/Nemotron-3-Ultra-Base/README.html, https://developer.nvidia.com/nemotron, https://developer.nvidia.com/blog/introducing-nemotron-3-super-an-open-hybrid-mamba-transformer-moe-for-agentic-reasoning/, https://blogs.nvidia.com/blog/nemotron-3-nano-omni-multimodal-ai-agents/, https://developer.nvidia.com/blog/nvidia-nemotron-3-nano-omni-powers-multimodal-agent-reasoning-in-a-single-efficient-open-model, https://developer.nvidia.com/blog/building-nvidia-nemotron-3-agents-for-reasoning-multimodal-rag-voice-and-safety/, https://nvidianews.nvidia.com/news/nvidia-debuts-nemotron-3-family-of-open-models, https://docs.nvidia.com/nemotron/latest/index.html, https://docs.nvidia.com/nemotron/latest/nemotron/nano3/README.html, https://docs.nvidia.com/nemotron/latest/nemotron/super3/README.html, https://docs.nvidia.com/nemotron/latest/usage-cookbook/Nemotron-3-Super/OpenScaffoldingResources/README.html, https://docs.nvidia.com/nemotron/latest/use-case-examples/Simple%20Nemotron-3-Nano%20Usage%20Example/README.html, https://docs.nvidia.com/nemo/megatron-bridge/latest/models/llm/nemotron3.html, https://docs.nvidia.com/nemo/megatron-bridge/latest/models/llm/nemotron3-super.html, https://docs.nvidia.com/nemo/microservices/latest/customizer/models/index.html, https://docs.nvidia.com/nemo/microservices/latest/fine-tune/models/llama-nemotron.html, https://docs.nvidia.com/nemo/microservices/latest/customizer/models/embedding.html, https://docs.nvidia.com/nemo/automodel/latest/model-coverage/llm.html, https://docs.nvidia.com/nemo/automodel/latest/model-coverage/llm/nvidia/nemotron.html, https://docs.nvidia.com/nim/vision-language-models/latest/support-matrix.html, https://docs.nvidia.com/nim/vision-language-models/latest/release-notes.html, https://docs.nvidia.com/nim/vision-language-models/latest/examples/nemotron-3-nano-omni-30b-a3b-reasoning/api.html, https://docs.nvidia.com/nim/vision-language-models/latest/examples/nemotron-parse/api.html, https://huggingface.co/nvidia/NVIDIA-Nemotron-Parse-v1.2, https://docs.nvidia.com/nim/large-language-models/latest/day-0/get-started-nemotron-content-safety-reasoning-4b.html, https://docs.nvidia.com/nim/speech/latest/index.html, https://docs.nvidia.com/nim/llama-3-1-nemotron-safety-guard-8b/latest/index.html, https://docs.nvidia.com/nemo/automodel/latest/index.html, https://docs.nvidia.com/nemo/rl/latest/about/model-support.html, https://docs.nvidia.com/nemo/megatron-bridge/latest/index.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/overview.html Last Updated: 2026-04-29

Summary

Nemotron is NVIDIA’s family of open and hosted AI models for agentic reasoning, instruction following, safety, retrieval, speech, OCR, and multimodal workflows. Current NVIDIA docs place Nemotron across NeMo-Customizer tested model catalogs, NeMo-AutoModel model coverage, NVIDIA-NIM serving surfaces, speech NIMs, VLM NIMs, and build.nvidia.com model cards.

Detail

Purpose

Nemotron gives NVIDIA a model family that can be trained and customized through NVIDIA-NeMo, deployed through NVIDIA-NIM, optimized on NVIDIA GPUs, and used as the reasoning/model layer for enterprise agents and AI applications.

Current model directions

NVIDIA context

Nemotron is central to NVIDIA’s agentic AI stack: NVIDIA-NIM exposes model endpoints, NVIDIA-Agent-Intelligence-Toolkit orchestrates workflows, NeMo-Retriever connects proprietary data, NeMo-Guardrails applies policy/safety, and NVIDIA-DGX-Cloud or self-hosted GPUs provide deployment infrastructure.

Connections

Source Excerpts

  • build.nvidia.com lists recent NVIDIA-published Nemotron models across reasoning, safety, speech, OCR, retrieval, and multimodal categories.
  • Current NeMo Customizer docs list multiple Nemotron and Llama Nemotron models in the tested model catalog, including reasoning LLMs and a Llama Nemotron embedding model.
  • Current Nemotron and Megatron Bridge docs provide dedicated coverage for Nemotron 3 Nano and Nemotron 3 Super.
  • Current NVIDIA Nemotron nightly docs describe Nemotron 3 Ultra Base as a 550B-total-parameter, 55B-active-per-token base checkpoint expected to receive a full release in 1H 2026.
  • Current NeMo AutoModel docs list Nemotron/Minitron and Nemotron H model coverage for Hugging Face-compatible training and fine-tuning.
  • Current VLM NIM release notes introduce NVIDIA Nemotron 3 Nano Omni in release 1.7.0.
  • Current VLM NIM release notes list Nemotron-Parse-v1.2 as the updated Nemotron Parse release with a changed API.
  • NVIDIA’s Nemotron 3 Content Safety model card identifies a multilingual multimodal safety model for prompts, images, and responses.
  • NVIDIA’s Nemotron ASR Streaming card describes a 600M-parameter English streaming ASR model.
  • NVIDIA’s Nemotron 3 VoiceChat model card describes a 12B full-duplex speech-to-speech model for realtime conversational AI.

Resources