NVIDIA DGX SuperPOD B200 Reference Architecture
Type: Reference Architecture Tags: NVIDIA, DGX SuperPOD, DGX B200, Blackwell, NDR InfiniBand, Spectrum-X, Mission Control, Base Command Manager, AI factory Related: NVIDIA-DGX-SuperPOD, NVIDIA-DGX-B200, NVIDIA-DGX, NVIDIA-DGX-BasePOD, NVIDIA-DGX-SuperPOD-GB200-RA, NVIDIA-DGX-B300, NVIDIA-Blackwell-Architecture, NVIDIA-Quantum-InfiniBand, NVIDIA-Spectrum-X, NVIDIA-Mission-Control, NVIDIA-Base-Command-Manager, NVIDIA-Run-ai, NVIDIA-AI-Enterprise, NGC, NVIDIA-UFM, NVIDIA-Certified-Storage, NVIDIA-Enterprise-AI-Factory Sources: https://docs.nvidia.com/dgx-superpod/reference-architecture-scalable-infrastructure-b200/latest/index.html, https://docs.nvidia.com/dgx-superpod/reference-architecture-scalable-infrastructure-b200/latest/abstract.html, https://docs.nvidia.com/dgx-superpod/reference-architecture-scalable-infrastructure-b200/latest/dgx-superpod-components.html, https://docs.nvidia.com/dgx-superpod/reference-architecture-scalable-infrastructure-b200/latest/network-fabrics.html, https://docs.nvidia.com/dgx-superpod/reference-architecture-scalable-infrastructure-b200/latest/dgx-software.html Last Updated: 2026-05-09
Summary
The NVIDIA DGX SuperPOD B200 Reference Architecture is the DGX SuperPOD design for Blackwell-generation NVIDIA-DGX-B200 systems. It specifies scalable units of 32 DGX systems each, NDR 400 Gb/s InfiniBand compute networking, optional Spectrum-X/Spectrum-4 Ethernet storage networking, and high-performance storage, together with a software stack spanning NVIDIA-Mission-Control, NVIDIA-Base-Command-Manager, NVIDIA-Run-ai, NVIDIA-AI-Enterprise, NGC, CUDA, Magnum IO, and Slurm, backed by NVIDIA Enterprise Experience support.
Detail
Purpose
Use this page for the NVIDIA document titled “NVIDIA DGX SuperPOD: Next Generation Scalable Infrastructure for AI Leadership Reference Architecture Featuring NVIDIA DGX B200.” Use NVIDIA-DGX-B200 for the single-system identity and NVIDIA-DGX-SuperPOD for the generic SuperPOD family.
Architecture notes
- Document number RA-11334-001 V08; publication date 2025-04-24; docs last updated 2025-11-19.
- The RA is based on NVIDIA-DGX-B200 systems with eight B200 GPUs per system.
- The design uses scalable units; each SU contains 32 DGX B200 systems.
- The fully tested system scales to four SUs, with larger deployments possible based on customer requirements.
- Each rack can support up to two DGX B200 systems, allowing the rack layout to be adapted to data center constraints.
- Compute fabric uses NVIDIA Quantum QM9700 NDR 400 Gb/s InfiniBand switches in a rail-optimized, balanced, full fat-tree topology, with eight NDR400 connections per DGX system.
- Storage fabric can be InfiniBand or RDMA over Converged Ethernet (RoCE); the RA lists Spectrum-X SN5600 800 Gb/s Ethernet as the optional Ethernet storage fabric.
- Storage planning separates high-performance storage from user storage: high-performance storage targets parallel I/O and checkpoint workflows, while user storage is typically exposed over NFS on the in-band management fabric.
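The scalable-unit figures above imply straightforward per-deployment totals. A minimal sketch, using only counts stated in this RA (32 systems per SU, 8 GPUs and 8 NDR 400 Gb/s links per system, up to 4 tested SUs); the derived totals are plain arithmetic, not NVIDIA-published performance numbers:

```python
# Back-of-the-envelope sizing for DGX SuperPOD B200 scalable units (SUs).
# Inputs come from the reference architecture; derived totals are simple
# arithmetic, not benchmarked or NVIDIA-published figures.

SYSTEMS_PER_SU = 32       # DGX B200 systems per scalable unit
GPUS_PER_SYSTEM = 8       # B200 GPUs per DGX system
NDR_LINKS_PER_SYSTEM = 8  # NDR400 compute-fabric connections per system
NDR_GBPS = 400            # NDR InfiniBand link rate in Gb/s
MAX_TESTED_SUS = 4        # fully tested scale in this RA

def su_totals(num_sus: int) -> dict:
    """Return system/GPU/link counts and aggregate compute-fabric
    injection bandwidth (Tb/s) for a deployment of `num_sus` SUs."""
    systems = num_sus * SYSTEMS_PER_SU
    return {
        "systems": systems,
        "gpus": systems * GPUS_PER_SYSTEM,
        "ndr_links": systems * NDR_LINKS_PER_SYSTEM,
        "injection_tbps": systems * NDR_LINKS_PER_SYSTEM * NDR_GBPS / 1000,
    }

for n in range(1, MAX_TESTED_SUS + 1):
    t = su_totals(n)
    print(f"{n} SU(s): {t['systems']} systems, {t['gpus']} GPUs, "
          f"{t['ndr_links']} NDR links, {t['injection_tbps']:.1f} Tb/s injection")
```

At the fully tested four-SU scale this works out to 128 systems and 1,024 GPUs, one NDR400 link per GPU into the rail-optimized fat tree.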
Software stack
The software stack includes NVIDIA-Mission-Control, NVIDIA-Base-Command-Manager, NVIDIA-Run-ai, NVIDIA-AI-Enterprise, Magnum IO, NGC, and Slurm. The current docs note that Mission Control includes Base Command Manager and Run:ai functionality.
NVIDIA context
This B200 RA is the SuperPOD bridge between NVIDIA-DGX-BasePOD scale and the later rack-scale NVIDIA-DGX-SuperPOD-GB200-RA / NVIDIA-DGX-SuperPOD-B300-Spectrum-4-Ethernet-RA / NVIDIA-DGX-SuperPOD-B300-Quantum-X800-InfiniBand-RA designs. It matters for customers comparing Blackwell-era DGX options before deciding between B200 nodes, GB200 rack-scale NVLink domains, and B300/GB300 Blackwell Ultra systems.
Connections
- NVIDIA-DGX-SuperPOD - parent SuperPOD architecture family.
- NVIDIA-DGX-B200 - DGX compute system used by this RA.
- NVIDIA-DGX - broader DGX systems family.
- NVIDIA-DGX-BasePOD - smaller enterprise reference architecture that can precede SuperPOD deployments.
- NVIDIA-DGX-SuperPOD-GB200-RA - next rack-scale Grace Blackwell SuperPOD RA.
- NVIDIA-DGX-B300 - Blackwell Ultra successor generation.
- NVIDIA-Blackwell-Architecture - GPU architecture foundation.
- NVIDIA-Quantum-InfiniBand - NDR InfiniBand compute and storage fabric context.
- NVIDIA-Spectrum-X - optional Ethernet/RoCE storage fabric direction.
- NVIDIA-Mission-Control - integrated operations and workload plane for current SuperPOD software.
- NVIDIA-Base-Command-Manager and NVIDIA-Run-ai - management and orchestration capabilities included through Mission Control.
- NVIDIA-AI-Enterprise - enterprise AI software layer.
- NGC - GPU-optimized containers and software catalog.
- NVIDIA-UFM - InfiniBand fabric management and telemetry.
- NVIDIA-Certified-Storage - storage partner validation is part of the SuperPOD design.
- NVIDIA-Enterprise-AI-Factory - the RA is a practical AI factory build pattern.
Source Excerpts
- NVIDIA describes the B200 RA as a DGX SuperPOD architecture for AI, HPC, and hybrid workloads.
- The abstract says each scalable unit contains 32 DGX B200 systems.
- The key-components page lists NDR 400 Gb/s InfiniBand compute fabric, optional Spectrum-X SN5600 Ethernet storage fabric, Mission Control, Base Command Manager, Run:ai, AI Enterprise, NGC, Magnum IO, and Slurm.