NeMo AutoModel

Type: Library
Tags: NVIDIA, NeMo, AutoModel, PyTorch, Hugging Face, LLM, VLM, training, fine-tuning, distributed training
Related: NVIDIA-NeMo, Nemotron-Training-Recipes, NeMo-Run, NeMo-RL, NeMo-Megatron-Bridge, NeMo-Export-Deploy, NeMo-Customizer, NeMo-Evaluator, Megatron-LM, Transformer-Engine, TensorRT-LLM, vLLM, Nemotron, Nemotron-3-Nano, Nemotron-3-Super, NVIDIA-NIM, NVIDIA-AI-Enterprise
Sources: https://docs.nvidia.com/nemo/automodel/latest/index.html, https://docs.nvidia.com/nemo/automodel/latest/about/index.html, https://docs.nvidia.com/nemo/automodel/latest/about/key-features.html, https://docs.nvidia.com/nemo/automodel/latest/launcher/nemo-run.html, https://docs.nvidia.com/nemo/automodel/latest/model-coverage/llm.html, https://docs.nvidia.com/nemo/automodel/latest/model-coverage/llm/nvidia/nemotron.html, https://docs.nvidia.com/nemotron/latest/usage-cookbook/Nemotron-3-Super/OpenScaffoldingResources/README.html, https://docs.nvidia.com/nemotron/latest/nemotron/nvidia-stack.html, https://docs.nvidia.com/deeplearning/transformer-engine/index.html
Last Updated: 2026-04-29

Summary

NeMo AutoModel is NVIDIA’s PyTorch-native training library under the NeMo Framework for accelerated Hugging Face model training and fine-tuning. Current NVIDIA docs position it as a day-0 Hugging Face-compatible path for LLMs, VLMs, diffusion models, and related workflows, with optimized recipes, PyTorch DTensor/SPMD scaling, FSDP2, MegatronFSDP, FP8, QAT, LoRA/QLoRA, distillation, tool calling, and distributed checkpointing.

Detail

Purpose

AutoModel lets teams keep Hugging Face-style model loading and checkpoint formats while using NVIDIA-optimized distributed training components. It is useful when a model must remain compatible with the Hugging Face ecosystem but still needs NVIDIA performance features such as fused attention, Transformer-Engine, DeepEP, FlexAttention, FP8, and multi-node scaling.
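
As an illustration of this drop-in pattern, here is a minimal sketch assuming the `nemo_automodel` package's `NeMoAutoModelForCausalLM` entry point; the model ID and output path are illustrative, not taken from the docs:

```python
# Minimal sketch of the drop-in pattern, assuming `nemo_automodel`
# exposes `NeMoAutoModelForCausalLM`; the model ID is illustrative.
from nemo_automodel import NeMoAutoModelForCausalLM
from transformers import AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B"  # any Hugging Face causal-LM checkpoint

# Loads the Hugging Face checkpoint directly; NVIDIA-optimized kernels
# (fused attention, Transformer Engine paths) apply under the hood.
model = NeMoAutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The object behaves like a regular Hugging Face model, so downstream
# tooling that expects `save_pretrained`-style output keeps working.
model.save_pretrained("./checkpoint-hf-format")
```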

Current scope

  • SFT, PEFT/LoRA, QLoRA, tool/function calling fine-tuning, sequence classification, QAT, knowledge distillation, and pretraining recipes (a LoRA sketch follows this list).
  • Dataset support for LLM, VLM, omni-model, diffusion, and retrieval workloads in the current docs.
  • Current LLM coverage includes NemotronForCausalLM for Nemotron 3, Nemotron 4, and Minitron; NemotronHForCausalLM for Nemotron Nano 9B/12B and Nemotron-3-Nano 30B-A3B; and Nemotron Flash coverage in the broader LLM table.
  • Current Nemotron usage docs include a Nemotron 3 Super fine-tuning path with NeMo AutoModel, so Nemotron-3-Super is part of the practical AutoModel/Nemotron workflow.
  • PyTorch-native parallelism built on DTensor, spanning FSDP2, MegatronFSDP, and tensor/sequence/pipeline/data parallel strategies (see the FSDP2 sketch after this list).
  • Hugging Face-compatible model loading and checkpoint output so trained checkpoints can stay interoperable.
  • Launch paths for local workstations, Slurm clusters, cloud/SkyPilot, and NeMo-Run.
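
The PEFT/LoRA workflow above can be sketched with the generic Hugging Face `peft` library. This illustrates the pattern AutoModel's recipes wrap; it is not AutoModel's internal API, and the model ID, rank, and target modules are assumptions:

```python
# Illustrative only: a generic Hugging Face + peft LoRA setup of the
# kind AutoModel's PEFT recipes wrap; NOT AutoModel's internal API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-3.2-1B"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Low-rank adapters on the attention projections; rank and targets are
# typical defaults, not values prescribed by the AutoModel docs.
lora_cfg = LoraConfig(r=16, lora_alpha=32,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only adapter weights are trainable
```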

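A minimal sketch of the FSDP2 sharding referenced above, assuming PyTorch >= 2.6 (where `fully_shard` is public API) and the `transformers` Llama module layout (`model.model.layers`); AutoModel's recipes configure this wiring internally:

```python
# Sketch of the PyTorch-native FSDP2 sharding AutoModel builds on,
# assuming PyTorch >= 2.6 where `fully_shard` is public API.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import fully_shard
from transformers import AutoModelForCausalLM

dist.init_process_group("nccl")  # one process per GPU, e.g. via torchrun
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B",
                                             torch_dtype=torch.bfloat16)

# Shard each transformer block, then the root module; parameters become
# DTensors sharded across the data-parallel group.
for block in model.model.layers:  # Llama layout; other models differ
    fully_shard(block)
fully_shard(model)
```
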
NVIDIA context

AutoModel sits between open Hugging Face model ecosystems and NVIDIA’s training/deployment stack. It complements managed NeMo-Customizer workflows, can feed NeMo-Evaluator or NeMo-RL, and can hand trained outputs toward NeMo-Export-Deploy, TensorRT-LLM, vLLM, Triton-Inference-Server, or NVIDIA-NIM.
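
A sketch of that hand-off, under the assumption that a fine-tuned run wrote a Hugging Face-format checkpoint (the path is illustrative), loaded here with vLLM's documented `LLM` entry point:

```python
# Sketch of the interoperability hand-off: checkpoints stay in Hugging
# Face format, so downstream engines can load them. Path is illustrative.
from vllm import LLM, SamplingParams

# Assumes a fine-tuned AutoModel run wrote an HF-format checkpoint here.
llm = LLM(model="./checkpoint-hf-format")
outputs = llm.generate(["Summarize NeMo AutoModel in one sentence."],
                       SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```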

Connections

  • NVIDIA-NeMo - parent framework family for model training, post-training, evaluation, and deployment.
  • Nemotron-Training-Recipes - adjacent recipe/cookbook path for Nemotron training that uses Megatron Bridge and NeMo RL rather than AutoModel as its core low-level recipe engine.
  • NeMo-Run - job-launch path for AutoModel workloads across local, Docker, Slurm, and Kubernetes-style environments.
  • NeMo-RL - post-training library that can use AutoModel as a PyTorch training backend for Hugging Face models.
  • NeMo-Megatron-Bridge - higher-scale Megatron Core bridge/training path for large LLM/VLM models and checkpoint conversion.
  • NeMo-Export-Deploy - export/deploy path for AutoModel and Hugging Face checkpoints.
  • NeMo-Customizer - managed NeMo Platform service for API-driven fine-tuning; AutoModel is the code/library-level counterpart.
  • NeMo-Evaluator - evaluation path for quality, retrieval, and agent benchmarks on trained or fine-tuned models.
  • Megatron-LM - related NVIDIA distributed training lineage used for scaling large transformer models.
  • Transformer-Engine - optimized transformer layer and low-precision recipe library adjacent to AutoModel’s FP8 and fused-attention paths (see the FP8 sketch after this list).
  • TensorRT-LLM, vLLM, and Triton-Inference-Server - downstream inference and serving paths.
  • Nemotron, Nemotron-3-Nano, and Nemotron-3-Super - current AutoModel coverage spans Nemotron/Minitron architectures, Nemotron-3-Nano, and Nemotron-3-Super fine-tuning workflows.
  • NVIDIA-AI-Enterprise - enterprise support and software context around NeMo training and deployment workflows.
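
As a concrete reference point for the FP8 path, a minimal Transformer Engine sketch using its documented `fp8_autocast` API; layer sizes are illustrative, and this is generic TE usage, not AutoModel-specific code:

```python
# Sketch of the Transformer Engine FP8 pattern adjacent to AutoModel's
# FP8 paths; uses TE's documented fp8_autocast API, shapes illustrative.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

fp8_recipe = DelayedScaling(fp8_format=Format.HYBRID, amax_history_len=16)
layer = te.Linear(1024, 1024, bias=True).cuda()
x = torch.randn(8, 1024, device="cuda", dtype=torch.bfloat16)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)  # GEMMs run in FP8 with delayed scaling
```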

Source Excerpts

  • NVIDIA docs describe AutoModel as a PyTorch SPMD training library for LLMs and VLMs with Hugging Face compatibility.
  • Current docs list SFT, PEFT, tool calling, VLM fine-tuning, diffusion fine-tuning, QAT, distillation, and pretraining workflows.
  • Current model coverage docs list Nemotron/Minitron architecture support and point Minitron users to Hugging Face-compatible fine-tuning recipes.

Resources