Red Hat AI Factory with NVIDIA

Type: Deployment Guide Tags: NVIDIA, Red Hat AI Factory, AI Enterprise, OpenShift AI, NIM, Kubernetes, AI factory, OpenShift, agentic AI Related: NVIDIA-AI-Enterprise, NVIDIA-AI-Enterprise-Lifecycle-Policy, NVIDIA-AI-Enterprise-Bare-Metal-Deployment, NVIDIA-AI-Enterprise-Cloud-Deployment, NVIDIA-Enterprise-AI-Factory, NVIDIA-Enterprise-Reference-Architectures, NVIDIA-AI-Enterprise-Software-Reference-Architecture, NVIDIA-AI-Software-for-Regulated-Environments, NVIDIA-AI-Enterprise-Security, NVIDIA-RTX-PRO-AI-Factory, NVIDIA-NIM, NIM-for-Large-Language-Models, NVIDIA-NIM-Operator, NVIDIA-GPU-Operator, NVIDIA-Network-Operator, NVIDIA-Dynamo, NIXL, vLLM, TensorRT-LLM, NVIDIA-Agent-Intelligence-Toolkit, NeMo-Platform, NeMo-Retriever, NVIDIA-AI-Blueprints, NVIDIA-AI-Q-Blueprint, Nemotron, NVIDIA-Cosmos, NVIDIA-Certified-Systems, NVIDIA-Certified-Storage, NVIDIA-Spectrum-X, NVIDIA-BlueField-DPU, NVIDIA-DOCA, GPUDirect-RDMA, GPU-Direct-Storage Sources: https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/index.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/platform-overview.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/overview.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/prerequisites.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/software-overview.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/network-operator.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/gpu-operator.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/deploy-ai-workloads-nim-operator.html, https://docs.nvidia.com/ai-enterprise/deployment/red-hat-ai-factory/latest/deploy-nvidia-nim-redhat.html Last Updated: 2026-04-29

Summary

Red Hat AI Factory with NVIDIA is NVIDIA’s deployment guide for a co-engineered enterprise AI factory stack that combines NVIDIA-AI-Enterprise with Red Hat OpenShift AI. The guide documents a production deployment pattern for running NVIDIA-accelerated AI workloads on OpenShift, covering GPU and networking operator setup, NIM model serving, OpenShift AI integration, and Gen AI Studio experimentation.

Detail

Purpose

This page is the canonical wiki page for the current NVIDIA-authored Red Hat AI Factory with NVIDIA guide. It should stay as one solution page rather than splitting the deployment steps into separate wiki pages. Use the linked component pages for details on NVIDIA-NIM, NVIDIA-NIM-Operator, NVIDIA-GPU-Operator, NVIDIA-Network-Operator, NVIDIA-Dynamo, NIXL, and NVIDIA-AI-Enterprise.

Stack

  • NVIDIA AI Enterprise: application and infrastructure software for production AI, including NIM, NeMo-family tooling, NGC-delivered assets, data center drivers, DOCA drivers, GPU Operator, Network Operator, NIM Operator, DPU/DPF operator paths, and Base Command Manager context.
  • Red Hat OpenShift and OpenShift AI: Kubernetes and MLOps control plane for projects, dashboards, model serving, Gen AI Studio, and Playground-style experimentation against hosted NIM-backed models.
  • NVIDIA infrastructure: NVIDIA-Certified-Systems with supported NVIDIA AI Enterprise GPUs, NVIDIA networking such as NVIDIA-Spectrum-X Ethernet or Quantum InfiniBand, and NVIDIA-Certified-Storage or supported dynamic storage classes.
  • NIM deployment paths: Helm-based LLM NIM deployment, NVIDIA-NIM-Operator deployment with NIMCache, NIMService, and NIMPipeline, KServe integration, and OpenShift AI model registration.
  • Scale-out inference: Red Hat AI Inference Server and NVIDIA NIM can use engines such as vLLM, TensorRT-LLM, or SGLang. The guide also positions llm-d with NVIDIA-Dynamo and NIXL for distributed inference patterns.
  • Networking: the NVIDIA-Network-Operator can be skipped on clusters without NVIDIA networking devices, but it becomes important for high-speed multi-node inference, GPUDirect-style workloads, RDMA, and large distributed serving deployments.
  • Security and operations: the solution combines Red Hat platform hardening with NVIDIA production-branch software, NGC authentication, possible air-gapped deployment patterns, BlueField/DOCA infrastructure services, and OpenShift-native lifecycle management.
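The NIM Operator path listed above pairs two custom resources: a NIMCache that pre-downloads model artifacts from NGC into a persistent volume, and a NIMService that serves the cached model. The sketch below follows the field layout of the NIM Operator's published `apps.nvidia.com/v1alpha1` examples; the model image, tag, secret names, and storage sizes are illustrative placeholders, not values taken from this guide.

```yaml
# Hypothetical NIMCache: pre-pulls model artifacts from NGC into a PVC.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMCache
metadata:
  name: llama-3-1-8b-instruct        # illustrative name
  namespace: nim-service
spec:
  source:
    ngc:
      modelPuller: nvcr.io/nim/meta/llama-3.1-8b-instruct:1.3.3  # placeholder tag
      pullSecret: ngc-secret         # docker-registry secret for nvcr.io
      authSecret: ngc-api-secret     # secret holding NGC_API_KEY
  storage:
    pvc:
      create: true
      storageClass: ""               # default or a certified storage class
      size: 50Gi
      volumeAccessMode: ReadWriteMany
---
# Hypothetical NIMService: serves the cached model on one GPU.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMService
metadata:
  name: llama-3-1-8b-instruct
  namespace: nim-service
spec:
  image:
    repository: nvcr.io/nim/meta/llama-3.1-8b-instruct
    tag: "1.3.3"
    pullSecrets:
      - ngc-secret
  authSecret: ngc-api-secret
  storage:
    nimCache:
      name: llama-3-1-8b-instruct    # must match the NIMCache above
  replicas: 1
  resources:
    limits:
      nvidia.com/gpu: 1
  expose:
    service:
      type: ClusterIP
      port: 8000                     # OpenAI-compatible API endpoint
```

A NIMPipeline can then group multiple NIMService definitions when a workload chains models together.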

NVIDIA context

The guide belongs in the same graph as NVIDIA-Enterprise-AI-Factory and the current NVIDIA-Enterprise-Reference-Architectures pages, but it is specifically the Red Hat/OpenShift deployment track. It is narrower than the strategic AI factory design guide and more platform-specific than the generic NVIDIA-AI-Enterprise-Software-Reference-Architecture.

Connections

Source Excerpts

  • NVIDIA describes the guide as a co-engineered Red Hat AI Factory deployment that integrates NVIDIA AI Enterprise with Red Hat OpenShift AI.
  • The prerequisites emphasize NVIDIA-Certified Systems, supported NVIDIA AI Enterprise GPUs, NVIDIA networking, and NVIDIA Certified Storage.
  • The deployment sections show NIM on OpenShift through Helm, NIM Operator custom resources, KServe, and OpenShift AI/Gen AI Studio integration.
  • The networking section notes that GPUDirect with RDMA can help large distributed inference workloads even when it is not strictly required.
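For the Helm-based path mentioned in the excerpts, the outline below is a minimal command sketch, assuming an exported `NGC_API_KEY` environment variable and cluster-admin `oc` access; the chart version, model repository, and secret names are placeholders rather than values from the guide itself.

```shell
# Namespace and NGC credentials (NGC_API_KEY assumed to be exported).
oc new-project nim
oc create secret docker-registry ngc-secret \
  --docker-server=nvcr.io \
  --docker-username='$oauthtoken' \
  --docker-password="$NGC_API_KEY"
oc create secret generic ngc-api --from-literal=NGC_API_KEY="$NGC_API_KEY"

# Fetch the NIM LLM chart from NGC and install it with a placeholder model.
helm fetch https://helm.ngc.nvidia.com/nim/charts/nim-llm-1.3.0.tgz \
  --username='$oauthtoken' --password="$NGC_API_KEY"
helm install my-nim nim-llm-1.3.0.tgz \
  --namespace nim \
  --set image.repository=nvcr.io/nim/meta/llama-3.1-8b-instruct \
  --set image.tag=1.3.3 \
  --set model.ngcAPISecret=ngc-api
```

Once the pod is ready, the service exposes an OpenAI-compatible endpoint that OpenShift AI can register as a served model.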

Resources