NVIDIA RAG Blueprint

Type: Platform Tags: NVIDIA, AI Blueprint, RAG, retrieval augmented generation, NeMo Retriever, NIM, multimodal RAG, enterprise search Related: NVIDIA-AI-Blueprints, NVIDIA-NIM, NVIDIA-NIM-Operator, NVIDIA-AI-Q-Blueprint, NVIDIA-AI-Data-Platform, NeMo-Retriever, NeMo-Evaluator, NeMo-Retriever-Embedding-NIM, Llama-Nemotron-Embed-1B-v2, Llama-Nemotron-Embed-VL-1B-v2, NIM-for-NV-CLIP, NeMo-Retriever-Reranking-NIM, Llama-Nemotron-Rerank-1B-v2, Llama-Nemotron-Rerank-VL-1B-v2, NIM-for-Image-OCR, NIM-for-Object-Detection, Nemotron-Parse, NIM-for-Vision-Language-Models, Nemotron-3-Nano-Omni, NVIDIA-NemoGuard-NIMs, Nemotron, cuVS, NVIDIA-AI-Enterprise, NVIDIA-Enterprise-Reference-Architectures Sources: https://docs.nvidia.com/rag/latest/, https://docs.nvidia.com/rag/latest/vlm-embed.html, https://docs.nvidia.com/rag/latest/multimodal-query.html, https://docs.nvidia.com/nemo/microservices/latest/evaluator/index.html, https://docs.nvidia.com/nim/vision-language-models/latest/examples/nemotron-parse/api.html, https://docs.nvidia.com/nim/vision-language-models/latest/examples/nemotron-3-nano-omni-30b-a3b-reasoning/api.html Last Updated: 2026-04-29

Summary

NVIDIA RAG Blueprint is NVIDIA’s current reference workflow for building retrieval augmented generation applications with NVIDIA NIM, NeMo Retriever, vector search, document ingestion, multimodal retrieval, evaluation, guardrails, and observability. The latest docs cover Docker, Kubernetes/Helm, NIM Operator, retrieval-only mode, Python package usage, multimodal embedding, VLM-based generation, and NVIDIA-hosted or self-hosted model endpoints.

Detail

Purpose

Enterprise RAG systems need more than a vector database and an LLM. NVIDIA RAG Blueprint provides an end-to-end NVIDIA-authored architecture for ingesting enterprise documents, extracting text and visual structure, embedding and indexing content, retrieving relevant context, generating answers, and evaluating or governing the pipeline.

Current scope

NVIDIA context

This page is the canonical wiki target for the durable RAG blueprint, not for every subpage under the RAG docs tree. It belongs between NeMo-Retriever, NVIDIA-AI-Q-Blueprint, NVIDIA-AI-Data-Platform, and NVIDIA-AI-Blueprints because those pages describe the services, agents, data architecture, and blueprint catalog around enterprise retrieval.

Connections

Source Excerpts

  • NVIDIA docs describe RAG Blueprint as documentation for getting started, customizing, and troubleshooting the RAG workflow.
  • Current multimodal query docs describe querying a knowledge base with both text and images.

Resources