nvFatbin

Type: Technology
Tags: NVIDIA, CUDA, fatbin, runtime compilation, module loading, compiler SDK
Related: NVIDIA-CUDA, CUDA-Driver-API, PTX-ISA, NVRTC, nvJitLink, NVCC
Sources: https://docs.nvidia.com/cuda/nvfatbin/index.html
Last Updated: 2026-04-29

Summary

nvFatbin is a CUDA library for creating CUDA fat binaries at runtime. It lets applications package multiple device-code variants, such as cubin, PTX, or LTO-IR inputs, into a fatbin that can later be loaded through the CUDA-Driver-API.

Detail

Purpose

CUDA applications sometimes need to generate or assemble GPU code dynamically while still preserving architecture-specific variants. nvFatbin gives them API-level control over building those fatbins at runtime, so packaging does not have to depend solely on offline toolchain steps.

Key capabilities

Runtime creation of CUDA fat binaries.
Inputs can include cubins, PTX (see PTX-ISA), or LTO-IR.
Output can be loaded with Driver API module-loading routines such as cuModuleLoadData.
Useful for applications that want architecture-specific optimized variants for Hopper, Blackwell, or other GPUs.
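
To illustrate why packaging several variants in one fatbin is useful, the following is a minimal sketch in plain C of the documented CUDA compatibility rules: a cubin built for compute capability X.y runs on devices X.z with z >= y, while PTX can be JIT-compiled for any device of equal or greater compute capability. The types and functions here (Variant, variant_is_usable, pick_variant) are hypothetical names made up for this note and are not part of nvFatbin or the Driver API. The driver's actual selection policy is internal, and this simplified model ignores architecture-specific targets such as sm_90a.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one fatbin entry (not an nvFatbin type). */
typedef enum { VARIANT_CUBIN, VARIANT_PTX } VariantKind;

typedef struct {
    VariantKind kind;
    int major; /* compute capability major, e.g. 9 for sm_90 */
    int minor; /* compute capability minor, e.g. 0 for sm_90 */
} Variant;

/* cubin: same major and minor <= device minor.
 * PTX:   compute capability <= device compute capability (JIT path). */
static bool variant_is_usable(const Variant *v, int dev_major, int dev_minor)
{
    if (v->kind == VARIANT_CUBIN) {
        return v->major == dev_major && v->minor <= dev_minor;
    }
    return v->major < dev_major ||
           (v->major == dev_major && v->minor <= dev_minor);
}

/* Assumed preference order for this sketch: a usable cubin beats PTX
 * (no JIT needed); within one kind, the highest compute capability wins.
 * Returns the index of the chosen variant, or -1 if none is usable. */
static int pick_variant(const Variant *variants, size_t count,
                        int dev_major, int dev_minor)
{
    assert(variants != NULL || count == 0);
    int best = -1;
    for (size_t i = 0; i < count; ++i) {
        const Variant *v = &variants[i];
        if (!variant_is_usable(v, dev_major, dev_minor)) {
            continue;
        }
        if (best < 0) {
            best = (int)i;
            continue;
        }
        const Variant *b = &variants[best];
        if (v->kind != b->kind) {
            if (v->kind == VARIANT_CUBIN) {
                best = (int)i; /* cubin replaces a previously chosen PTX */
            }
            continue;
        }
        if (v->major > b->major ||
            (v->major == b->major && v->minor > b->minor)) {
            best = (int)i;
        }
    }
    return best;
}
```

Under this model, a fatbin holding cubins for 8.0 and 9.0 plus PTX for 9.0 covers Ampere and Hopper devices with native code and still leaves a JIT path for newer architectures, while a device older than every entry gets no usable variant.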

NVIDIA context

nvFatbin complements NVRTC, nvJitLink, and PTX-Compiler-APIs in dynamic GPU-code generation systems. It is especially relevant to frameworks, DSLs, inference runtimes, and plugin systems that compile or specialize GPU kernels at runtime.

Connections

CUDA-Driver-API - loads fatbins created by nvFatbin.
NVRTC - can generate PTX inputs for runtime packaging.
nvJitLink - handles runtime device-code linking before or alongside packaging flows.
PTX-ISA - PTX can be one of the input forms.
NVCC - offline compilation still produces related CUDA binary artifacts.

Source Excerpts

NVIDIA’s nvFatbin guide describes creating fatbins at runtime from multiple device-code variants.
