Megatron Energon

Type: Library Tags: NVIDIA, Megatron Energon, data loading, multimodal data, WebDataset, JSONL, distributed training, PyTorch, Megatron Core, Megatron-LM Related: Megatron-Core, Megatron-LM, NVIDIA-NeMo, NeMo-Megatron-Bridge, Nemotron-Training-Recipes, PyTorch, NVIDIA-DALI, NeMo-Curator, NVIDIA-Optimized-Frameworks, NVIDIA-Resiliency-Extension Sources: https://docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/features/megatron_energon.html, https://nvidia.github.io/Megatron-Energon/, https://nvidia.github.io/Megatron-Energon/basic/quickstart.html, https://nvidia.github.io/Megatron-Energon/basic/data_prep.html, https://github.com/NVIDIA/Megatron-Energon Last Updated: 2026-04-29

Summary

Megatron Energon is NVIDIA’s multimodal data-loading library for Megatron-scale training. Current NVIDIA docs describe it as an advanced dataloader for efficient loading of text, image, video, and audio data at scale, with distributed loading, dataset blending, WebDataset/JSONL-oriented formats, resumability, packing, grouping, joining, object-storage streaming, and command-line data-preparation tools.

Detail

Purpose

Large multimodal training jobs are often bottlenecked by data movement, decoding, shuffling, blending, and resumability. Megatron Energon addresses that data-input layer for Megatron-Core and Megatron-LM workflows, while remaining usable outside Megatron when a project needs large-scale multimodal dataset loading.

This page is the canonical wiki target for Megatron Energon. Do not split the quickstart, data-preparation tutorials, WebDataset layout, remote dataset guide, packing/grouping/joining features, CLI commands, or API module pages into separate wiki pages unless NVIDIA publishes a distinct durable product/topic around them.

Current capabilities

Multimodal sample loading for text, images, video, and audio.
Distributed loading across workers, processes, and multi-node clusters.
Dataset blending with configurable weights and metadataset support.
WebDataset-oriented storage with Energon metadata, plus JSONL support for simpler cases.
Save/restore of data-loading state so training can resume reproducibly.
Packing, grouping, joining, subsets, epochized blending, custom sample loaders, and reproducible scaling features.
Remote dataset access, including S3-compatible object storage patterns in current docs.
CLI utilities such as energon prepare, energon info, energon lint, energon mount, and energon preview.

NVIDIA stack context

Megatron Energon complements, rather than replaces, other NVIDIA data tools:

NeMo-Curator prepares and filters large training datasets before training.
NVIDIA-DALI accelerates decode/augmentation pipelines, especially image/video/audio preprocessing.
Megatron Energon focuses on dataset format, multimodal sample loading, distributed sharding, blending, and resumable iteration for large training jobs.

Connections

Megatron-Core - current Megatron Core docs surface Megatron Energon as a feature for large-scale multimodal training data loading.
Megatron-LM - reference implementation that can use Megatron Energon in multimodal training flows.
NVIDIA-NeMo and NeMo-Megatron-Bridge - adjacent training stack where Megatron data-loading and checkpointing workflows matter.
Nemotron-Training-Recipes - long-running recipe jobs need reproducible data iteration and resume behavior.
PyTorch - Megatron Energon is a Python package used in PyTorch-style training loops.
NVIDIA-DALI and NeMo-Curator - complementary data loading/preprocessing and data curation layers.
NVIDIA-Optimized-Frameworks - container context for running Megatron/Energon training environments.
NVIDIA-Resiliency-Extension - related resiliency package for restart/checkpoint behavior around long-running jobs.

Source Excerpts

NVIDIA docs describe Megatron Energon as an advanced multimodal dataloader for text, image, video, and audio at scale.
Current docs list WebDataset with extra metadata and JSONL support as dataset formats.
The GitHub README describes Energon as the multi-modal data loader of Megatron that can also be used independently.

AIPS BOOM

Explorer

Megatron-Energon

Megatron Energon

Summary

Detail

Purpose

Current capabilities

NVIDIA stack context

Connections

Source Excerpts

Resources

Graph View

Table of Contents

Backlinks