NVIDIA GB200 NVL4

Type: Platform
Tags: NVIDIA, GPU, Hardware, NVLink, Blackwell, Grace, Data Center, Single-Node, AI
Related: NVIDIA-GB200-NVL72, NVIDIA-Blackwell-Architecture, NVLink, NVIDIA-Grace-CPU, NVIDIA-HGX, NVIDIA-DGX, NCCL
Sources: ServeTheHome, TweakTown, Tom’s Hardware, VideoCardz (November 2024 coverage of the NVIDIA announcement)
Last Updated: 2026-04-10

Summary

The NVIDIA GB200 NVL4 is a single-server Grace Blackwell configuration that combines four Blackwell B200 GPUs and two Grace CPUs on an enlarged motherboard, connected via fifth-generation NVLink. It targets data centers that need Blackwell-generation AI performance without the full rack-scale footprint of the NVL72. It offers up to 1.3 TB of coherent shared memory and draws roughly 6 kW at the system level, which positions it as the entry point into the GB200 ecosystem.

Detail

Purpose

The NVL4 bridges the gap between single-GPU or dual-GPU Blackwell deployments and the full rack-scale NVL72. It lets organizations deploy Blackwell AI compute in a standard server form factor, without liquid-cooling infrastructure or a full-rack commitment.

Key Specifications

GPUs: 4x NVIDIA Blackwell B200 GPUs
CPUs: 2x NVIDIA Grace CPUs (Arm Neoverse V2, 72 cores each — 144 total)
Coherent memory: Up to 1.3 TB (HBM3E GPU memory + Grace LPDDR5X unified via NVLink-C2C)
NVLink bandwidth: 1.8 TB/s bidirectional per GPU (5th-gen NVLink)
System power: ~6 kW (full server with NICs, SSDs, and components)
Form factor: Single server, enlarged motherboard (4-way NVLink domain)
Availability: H2 2025, via OEM partners (MSI, ASUS, Gigabyte, Lenovo, HPE, Wistron, Pegatron, ASRock Rack)

Performance vs. Prior Generation (GH200 NVL4)

2.2x simulation performance
1.8x training performance
1.8x inference performance
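
To make the quoted factors concrete, the sketch below converts a GH200 NVL4 wall-clock time into a projected GB200 NVL4 time. It assumes the quoted speedup applies uniformly to the whole job, which real workloads rarely satisfy, so the result is only a rough estimate. The function name `projected_hours` and the 36-hour baseline are hypothetical and chosen for illustration.

```python
# Speedup factors quoted in this note for GB200 NVL4 relative to GH200 NVL4.
SPEEDUP_VS_GH200_NVL4 = {
    "simulation": 2.2,
    "training": 1.8,
    "inference": 1.8,
}


def projected_hours(baseline_hours: float, workload: str) -> float:
    """Divide a GH200 NVL4 wall-clock time by the quoted speedup factor.

    Assumption: the factor applies uniformly to the entire job.
    """
    return baseline_hours / SPEEDUP_VS_GH200_NVL4[workload]


if __name__ == "__main__":
    # Hypothetical 36-hour baseline, used only as an example input.
    for name in SPEEDUP_VS_GH200_NVL4:
        print(f"{name}: {projected_hours(36.0, name):.1f} h")
```

Under these assumptions a 36-hour training run on the prior generation would take about 20 hours, and a 36-hour simulation about 16.4 hours.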

Key Features

4-way NVLink domain — all four B200 GPUs fully connected via 5th-gen NVLink
NVLink-C2C connects Grace CPUs to B200 GPUs at 900 GB/s per link
Unified coherent 1.3 TB memory pool across CPUs and GPUs
Single-node deployment — no NVLink Switch required (unlike NVL72)
Compatible with standard server chassis; no dedicated liquid-cooling rack needed
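
The fully connected 4-way domain can be pictured as every GPU pair being direct peers. The sketch below enumerates those pairs and compares the quoted link rates. The GPU labels are hypothetical, and the even per-peer split of the 1.8 TB/s figure is an assumption made for illustration, not a published link assignment.

```python
from itertools import combinations

GPUS = ["gpu0", "gpu1", "gpu2", "gpu3"]  # hypothetical labels for the four B200s

NVLINK_TBPS_PER_GPU = 1.8   # bidirectional, per GPU (from the specifications)
NVLINK_C2C_GBPS = 900       # per CPU-to-GPU link (from the feature list)


def peer_pairs(gpus):
    """Return every unordered GPU pair in a fully connected domain."""
    return list(combinations(gpus, 2))


def peers_per_gpu(gpus):
    """In a fully connected domain, each GPU peers with all the others."""
    return len(gpus) - 1


def even_split_tbps(gpus):
    """ASSUMPTION: per-GPU NVLink bandwidth is shared equally among peers."""
    return NVLINK_TBPS_PER_GPU / peers_per_gpu(gpus)


if __name__ == "__main__":
    print("GPU pairs:", len(peer_pairs(GPUS)))
    print("Peers per GPU:", peers_per_gpu(GPUS))
    print(f"Per-peer share if split evenly: {even_split_tbps(GPUS):.1f} TB/s")
    ratio = NVLINK_TBPS_PER_GPU * 1000 / NVLINK_C2C_GBPS
    print(f"NVLink per GPU vs. one C2C link: {ratio:.1f}x")
```

Four GPUs give six direct pairs with three peers each, which is small enough to wire point to point and is consistent with the absence of an NVLink Switch.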

How It Differs from NVL72

	NVL4	NVL72
GPUs	4x B200	72x B200
CPUs	2x Grace	36x Grace
Memory	~1.3 TB coherent (HBM3E + LPDDR5X)	13.4 TB (HBM3E only)
NVLink	4-way domain	72-way all-to-all
Form factor	Single server	Full rack (liquid-cooled)
NVLink Switch	Not required	Required
Power	~6 kW	Rack-scale
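
The GPU and CPU counts in the table imply two simple ratios, sketched below: both platforms pair two GPUs with each Grace CPU, and the NVL72 is 18 times the NVL4 in device count. The dictionary layout and function names are hypothetical; only the counts come from the table.

```python
from fractions import Fraction

# Device counts taken from the comparison table.
NVL4 = {"gpus": 4, "cpus": 2}
NVL72 = {"gpus": 72, "cpus": 36}


def gpus_per_cpu(config):
    """Ratio of Blackwell GPUs to Grace CPUs in one configuration."""
    return Fraction(config["gpus"], config["cpus"])


def scale_factor(small, large, key):
    """How many times larger `large` is than `small` for a given device type."""
    return Fraction(large[key], small[key])


if __name__ == "__main__":
    print("NVL4 GPUs per CPU:", gpus_per_cpu(NVL4))
    print("NVL72 GPUs per CPU:", gpus_per_cpu(NVL72))
    print("NVL72 / NVL4 GPU count:", scale_factor(NVL4, NVL72, "gpus"))
```

Matching the GPU count of one NVL72 would take 18 NVL4 servers, but those servers would form 18 separate 4-way NVLink domains rather than one 72-way domain.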

Use Cases

Mid-scale LLM inference (models that fit within 1.3 TB)
Enterprise AI deployment without full rack infrastructure
HPC simulation workloads at departmental scale
Development and fine-tuning of large models
Organizations stepping into Blackwell without full NVL72 commitment
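
Whether a model "fits within 1.3 TB" can be estimated from its parameter count and numeric precision. The sketch below is a rough sizing aid under stated assumptions: decimal terabytes, weights only, and a hypothetical 20% reserve for KV cache, activations, and runtime overhead. The function names and the reserve value are illustrative, not vendor guidance.

```python
COHERENT_MEMORY_TB = 1.3  # upper bound quoted in this note

# Bytes per parameter for common weight formats.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weights_tb(params_billion: float, dtype: str) -> float:
    """Weight footprint in decimal terabytes (1 TB = 1e12 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 1e12


def fits(params_billion: float, dtype: str, reserve: float = 0.2) -> bool:
    """True if the weights fit after holding back `reserve` of the pool.

    ASSUMPTION: `reserve` (20% by default) stands in for KV cache,
    activations, and runtime overhead; tune it for a real workload.
    """
    budget_tb = COHERENT_MEMORY_TB * (1.0 - reserve)
    return weights_tb(params_billion, dtype) <= budget_tb


if __name__ == "__main__":
    for params, dtype in [(405, "fp16"), (1000, "fp16"), (1000, "fp8")]:
        size = weights_tb(params, dtype)
        print(f"{params}B @ {dtype}: {size:.2f} TB, fits={fits(params, dtype)}")
```

Because the pool mixes HBM3E and Grace LPDDR5X, a model that fits by capacity may still place some weights in the slower tier, so this check says nothing about throughput.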

Target Customers

Enterprises, research institutions, and cloud providers that want Blackwell-generation AI in a standard data center footprint. OEM ecosystem: Lenovo, HPE, ASUS, MSI, Gigabyte, ASRock Rack, Wistron, Pegatron.

Connections

NVIDIA-GB200-NVL72 — the rack-scale 72-GPU counterpart; NVL4 is the single-node entry point
NVIDIA-Blackwell-Architecture — built on B200 Blackwell GPUs
NVIDIA-Grace-CPU — 2x Grace CPUs unified with GPUs via NVLink-C2C
NVLink — 5th-gen NVLink connects all 4 GPUs at 1.8 TB/s per GPU
NVIDIA-HGX — alternative 8-GPU SXM baseboard for OEM servers (x86 CPU)
NVIDIA-DGX — NVIDIA’s turnkey complete system; NVL4 is OEM/ODM-based
