nvCOMP
Type: Technology Tags: CUDA, NVIDIA, GPU, Compression, Data Processing, HPC, AI, Storage Related: GPU-Direct-Storage, cuDF, Thrust, NVIDIA-DALI Sources: NVIDIA official documentation Last Updated: 2026-04-09
Summary
nvCOMP is NVIDIA’s GPU-accelerated data compression and decompression library, optimized for high-throughput data handling in AI training, HPC, data science, and analytics workloads. Starting with version 4.2, it can use the dedicated Decompression Engine on Blackwell GPUs to reach up to 600 GB/s decompression throughput. It supports standard formats (LZ4, Snappy, ZSTD, Deflate) as well as GPU-native formats (Bitcomp, GDeflate, gANS, Cascaded).
Detail
Purpose
Modern data-intensive workloads (AI training, HPC simulations, data analytics) must move and process enormous volumes of data. nvCOMP accelerates compression and decompression directly on the GPU, reducing storage footprint and improving effective I/O bandwidth; its nvCOMPDX extension enables in-kernel (device-side) compression, keeping data resident on the GPU and avoiding CPU round trips.
Key Features
- GPU-accelerated compression and decompression
- Standard algorithm support: LZ4, Snappy, ZSTD, Deflate
- GPU-optimized formats: Bitcomp, GDeflate, gANS, Cascaded
- Blackwell Decompression Engine support (v4.2+): up to 600 GB/s throughput
- Fused copy-decompress operations for reduced latency
- nvCOMPDX: device-side (in-kernel) compression extensions for C/C++
- Python APIs for integration with PyTorch and TensorFlow
- Optimized memory bandwidth utilization and reduced storage overhead
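The GPU-native Cascaded format is built from simple, highly parallel stages such as delta encoding, run-length encoding, and bit packing, which is why it excels on sorted or low-cardinality numeric columns. A minimal CPU-side sketch of the delta + RLE idea (an illustration of the general technique, not nvCOMP's actual implementation):

```python
def delta_encode(values):
    # Keep the first value, then store successive differences.
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out

def rle_encode(values):
    # Collapse repeated values into (value, run_length) pairs.
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def rle_decode(runs):
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out

# A sorted column: deltas are mostly identical, so RLE collapses them.
column = list(range(1000, 2000)) + [2000] * 500
encoded = rle_encode(delta_encode(column))
decoded = delta_decode(rle_decode(encoded))
assert decoded == column
print(len(column), "values ->", len(encoded), "runs")  # 1500 values -> 3 runs
```

Each stage is embarrassingly parallel over the input, which is what makes this family of transforms a good fit for GPU execution.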
Use Cases
- AI training data pipeline acceleration (decompress data directly into GPU memory)
- HPC checkpoint compression and decompression
- Data analytics over compressed columnar data (Parquet, ORC)
- GPU communication compression for distributed training
- Storage-efficient model checkpointing
- Real-time database query acceleration
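Batched APIs like nvCOMP's split data into many independent chunks so that thousands of GPU threads can compress or decompress in parallel. A CPU-side stand-in using Python's zlib (Deflate, one of the supported formats) shows the chunk-and-batch shape of such a checkpoint pipeline; the chunk size and helper names here are illustrative, not nvCOMP's API:

```python
import zlib

CHUNK_SIZE = 64 * 1024  # illustrative chunk size

def compress_batched(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    # Split into independent chunks; on a GPU, each chunk would be
    # handled by a separate group of threads.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return [zlib.compress(c) for c in chunks]

def decompress_batched(compressed: list[bytes]) -> bytes:
    # Chunks are independent, so decompression parallelizes the same way.
    return b"".join(zlib.decompress(c) for c in compressed)

# Round-trip a mock checkpoint buffer (repetitive data compresses well).
checkpoint = b"layer.weight=0.0;" * 50_000
batch = compress_batched(checkpoint)
restored = decompress_batched(batch)
assert restored == checkpoint
ratio = len(checkpoint) / sum(len(c) for c in batch)
print(f"{len(batch)} chunks, compression ratio {ratio:.1f}x")
```

The key design point carried over to the GPU is chunk independence: no chunk's output depends on another's, so the batch maps cleanly onto parallel hardware.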
Hardware Requirements
- NVIDIA GPU with CUDA support
- Blackwell (B100/B200) for dedicated Decompression Engine (v4.2+)
- Earlier GPUs (Ampere, Hopper) supported via CUDA kernel implementations rather than the dedicated engine
Language Bindings
- Python (high-level bindings for data science and ML workflows)
- C/C++ (core library API; nvCOMPDX for in-kernel compression)
Connections
- GPU-Direct-Storage — nvCOMP and GDS together let compressed data flow directly from storage to GPU memory without CPU involvement
- cuDF — cuDF uses nvCOMP for reading and writing compressed Parquet/ORC files
- Thrust — nvCOMP leverages Thrust for parallel data-processing primitives
- NVIDIA-DALI — DALI data pipelines can use nvCOMP for loading compressed image/video data