Nemotron 3 Ultra

Type: NVIDIA model
Tags: NVIDIA, Nemotron, LLM, open model, base model, reasoning, MoE, Mamba, Transformer, NVFP4, Blackwell
Related: Nemotron, Nemotron-3-Nano, Nemotron-3-Super, Nemotron-3-Nano-Omni, Nemotron-Training-Recipes, NeMo-AutoModel, NeMo-RL, NeMo-Megatron-Bridge, NVIDIA-NIM, TensorRT-LLM, NVIDIA-Blackwell-Architecture, NVIDIA-GB200-NVL72
Sources: https://docs.nvidia.com/nemotron/nightly/usage-cookbook/Nemotron-3-Ultra-Base/README.html, https://nvidianews.nvidia.com/news/nvidia-debuts-nemotron-3-family-of-open-models, https://nvidianews.nvidia.com/news/nvidia-expands-open-model-families-to-power-the-next-wave-of-agentic-physical-and-healthcare-ai, https://docs.nvidia.com/ai-enterprise/planning-resource/ai-factory-white-paper/latest/ai-factory-overview.html
Last Updated: 2026-04-29

Summary

Nemotron 3 Ultra is the largest announced model in NVIDIA's open Nemotron 3 family. Current NVIDIA Nemotron docs describe Ultra Base as a 550B-total-parameter, 55B-active-per-token hybrid Mamba-Transformer MoE pretraining base checkpoint with a 1M-token context window, NVFP4 pretraining, LatentMoE, and Multi-Token Prediction. NVIDIA also positions Ultra as the large reasoning engine in the Nano/Super/Ultra Nemotron 3 family.

Detail

Current status

The current NVIDIA Nemotron Ultra page lives in the nightly docs and describes Ultra as a base checkpoint, not a post-trained assistant. NVIDIA explicitly frames it as a starting point for customization: fine-tuning, reinforcement-learning post-training, and instruction-tuning pipelines. The same page says weights are expected with the full Nemotron 3 Ultra release in 1H 2026.
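If the eventual release follows the usual open-weights pattern, loading the base checkpoint for customization would look like the standard Hugging Face transformers flow. A minimal sketch, assuming a hypothetical repo ID (not yet published) and a transformers-compatible checkpoint; nothing here reflects a confirmed NVIDIA release artifact:

```python
# Hypothetical sketch: loading the Ultra base checkpoint for fine-tuning once
# weights ship. The repo ID below does NOT exist yet; the calls are the
# standard transformers pattern for any causal-LM base checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "nvidia/Nemotron-3-Ultra-Base"  # placeholder, not yet published
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype="auto",
    trust_remote_code=True,  # hybrid Mamba-Transformer blocks often ship custom code
)
# From here: wrap in a standard SFT or RL post-training loop on domain data.
```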

Purpose

Nemotron 3 Ultra targets the high-accuracy end of NVIDIA's open agentic-model stack. While Nemotron-3-Nano targets high-throughput agent steps and Nemotron-3-Super targets higher-capability reasoning, Ultra is positioned for complex planning, deep research, coding, search, and workflow automation where maximum reasoning quality matters more than a small model footprint.

Model characteristics

  • 550B total parameters in the current Ultra Base documentation.
  • Up to 55B active parameters per token through a hybrid Mamba-Transformer MoE architecture (see the sizing sketch after this list).
  • 1M-token context length using Mamba-2 layers for long-context efficiency.
  • NVFP4 pretraining, aligned with the Blackwell-era low-precision training/inference direction.
  • LatentMoE token compression before expert routing, enabling more specialized experts at similar inference cost.
  • Multi-Token Prediction for coherent reasoning and speculative-decoding-style execution (toy acceptance sketch below).
  • Benchmarked by NVIDIA on NVIDIA-GB200-NVL72 systems, per the current nightly page.
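As a back-of-envelope check on the first two bullets, the sketch below computes the active fraction and a rough per-token compute cost. The parameter counts come from the list above; the ~2 FLOPs-per-active-parameter rule of thumb is the standard dense-forward estimate, not an NVIDIA figure:

```python
# Back-of-envelope MoE sizing using only the figures quoted in the list above.
TOTAL_PARAMS = 550e9    # total parameters across all experts
ACTIVE_PARAMS = 55e9    # parameters activated per token

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS  # 0.10 -> 10% of weights touched per token

# Rough forward-pass cost scales with *active* params (~2 FLOPs per active
# parameter per token), so per-token inference cost sits closer to a 55B
# dense model than a 550B one.
flops_per_token = 2 * ACTIVE_PARAMS
print(f"active fraction: {active_fraction:.0%}, ~{flops_per_token:.2e} FLOPs/token")
# -> active fraction: 10%, ~1.10e+11 FLOPs/token
```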

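On the Multi-Token Prediction bullet: MTP heads let a model propose several tokens per step, which pairs naturally with speculative-decoding-style verification. A toy sketch of the accept-the-matching-prefix idea (greedy variant; production schedulers use probabilistic acceptance, and nothing here reflects NVIDIA's actual implementation):

```python
# Toy illustration of speculative-decoding-style acceptance with multi-token
# prediction: the MTP heads draft k tokens at once, and the verifier keeps
# the longest prefix matching its own choices. All values are illustrative.
def accept_prefix(drafted, verified):
    """Return drafted tokens up to (excluding) the first disagreement."""
    accepted = []
    for d, v in zip(drafted, verified):
        if d != v:
            break
        accepted.append(d)
    return accepted

drafted = [42, 7, 13, 99]   # k=4 tokens proposed in one MTP step
verified = [42, 7, 13, 5]   # tokens the full model would emit one at a time
print(accept_prefix(drafted, verified))  # -> [42, 7, 13]: 3 tokens per verify pass
```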
Important distinction

Do not treat Nemotron 3 Ultra Base as a drop-in chatbot. Use this page for the Ultra model identity, architecture, status, and future customization direction. Use Nemotron-3-Super, Nemotron-3-Nano, Nemotron-3-Nano-Omni, or hosted NVIDIA-NIM endpoints for currently deployable Nemotron workflows.
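For the hosted route, NVIDIA's NIM endpoints expose an OpenAI-compatible API. A minimal sketch, where the base URL is NVIDIA's published gateway but the model ID is a placeholder to swap for whatever the NIM catalog actually lists:

```python
# Minimal sketch of querying a hosted NIM endpoint via its OpenAI-compatible API.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # NVIDIA's hosted NIM gateway
    api_key=os.environ["NVIDIA_API_KEY"],            # key issued via build.nvidia.com
)

resp = client.chat.completions.create(
    model="nvidia/nemotron-3-super",  # PLACEHOLDER ID: use the catalog's actual listing
    messages=[{"role": "user", "content": "Draft a step-by-step research plan."}],
    temperature=0.2,
)
print(resp.choices[0].message.content)
```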

Connections

Source Excerpts

  • NVIDIA describes Nemotron 3 Ultra Base as a 550B-parameter, 55B-active-per-token hybrid Mamba-Transformer MoE base model.
  • NVIDIA’s current materials position Ultra as the large reasoning model in the Nemotron 3 Nano/Super/Ultra family.

Resources