Dynamo Profiler

Type: Tool
Tags: NVIDIA, Dynamo, Profiler, AI Configurator, AIPerf, LLM inference, TensorRT-LLM, vLLM, SGLang, SLA
Related: NVIDIA-Dynamo, Dynamo-Planner, Dynamo-Disaggregated-Serving, Dynamo-KV-Cache-Aware-Routing, Dynamo-KV-Block-Manager, NVIDIA-AIPerf, TensorRT-LLM, vLLM, NVIDIA-Grove
Sources: https://docs.nvidia.com/dynamo/latest/components/profiler, https://docs.nvidia.com/dynamo/latest/user-guides/benchmarking, https://docs.nvidia.com/dynamo/latest/getting-started/introduction
Last Updated: 2026-04-29

Summary

Dynamo Profiler is NVIDIA Dynamo’s automated performance analysis component. It measures model inference characteristics and produces deployment recommendations: it recommends tensor-parallel settings for the prefill and decode engines and generates performance data for Dynamo-Planner. It supports two profiling methods, online profiling with NVIDIA-AIPerf and faster offline estimation through AI Configurator.

Detail

Purpose

Disaggregated and autoscaled LLM deployments need configuration data before they can reliably meet latency and throughput goals. Dynamo Profiler analyzes model, workload, backend, hardware, and SLA assumptions so the stack can choose prefill/decode parallelism and generate deployment manifests instead of relying on manual sweeps.
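To make the selection step concrete, the following is a minimal sketch of how the results of a parallelism sweep could be filtered against an SLA. It is not Dynamo's implementation. The `ProfilePoint` type, the `pick_config` helper, and every number in it are hypothetical, and the selection rule (highest throughput per GPU among configurations that meet the latency target) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ProfilePoint:
    """One hypothetical sweep result for a single engine configuration."""
    tp: int               # tensor-parallel size, i.e. GPUs used by the engine
    latency_ms: float     # latency metric under test (e.g. TTFT for prefill)
    tokens_per_s: float   # engine throughput observed at that latency


def pick_config(points: Sequence[ProfilePoint], sla_ms: float) -> Optional[ProfilePoint]:
    """Return the SLA-compliant point with the best throughput per GPU.

    Returns None when no configuration meets the latency target.
    """
    compliant = [p for p in points if p.latency_ms <= sla_ms]
    if not compliant:
        return None
    return max(compliant, key=lambda p: p.tokens_per_s / p.tp)


# Illustrative, made-up numbers; real values would come from profiling.
prefill_sweep = [
    ProfilePoint(tp=1, latency_ms=420.0, tokens_per_s=9000.0),
    ProfilePoint(tp=2, latency_ms=230.0, tokens_per_s=16000.0),
    ProfilePoint(tp=4, latency_ms=130.0, tokens_per_s=28000.0),
]

best = pick_config(prefill_sweep, sla_ms=250.0)
```

With these placeholder numbers, TP=1 has the best per-GPU throughput but misses the 250 ms target, so the sketch settles on TP=2. Ranking by throughput per GPU rather than raw throughput keeps the choice from drifting toward the largest configuration by default.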

Current scope

  • Dense model profiling across SGLang, TensorRT-LLM, and vLLM.
  • MoE model profiling, to the extent listed in the current feature matrices.
  • Offline profiling through AI Configurator for fast, estimated configurations.
  • Online profiling through AIPerf for higher-accuracy measurements.
  • Interactive WebUI, as described in the current docs.
  • Recommended tensor-parallel settings for prefill and decode engines.
  • Performance interpolation data for Planner, along with generated DynamoGraphDeployment manifests.
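As an illustration of what "performance interpolation data" can mean in practice, here is a minimal sketch of estimating latency at an unprofiled input length from a handful of profiled points. The `interpolate` helper, the sample input lengths, and the latency values are all hypothetical. The format Profiler actually writes and the way Planner consumes it are not described here, so the piecewise-linear scheme below is an assumption.

```python
from bisect import bisect_right
from typing import Sequence


def interpolate(xs: Sequence[float], ys: Sequence[float], x: float) -> float:
    """Piecewise-linear interpolation over profiled points.

    xs must be strictly increasing. Queries outside the profiled range
    are clamped to the nearest endpoint instead of being extrapolated.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)          # index of the first profiled point > x
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Illustrative, made-up profile: input sequence length -> TTFT in ms.
isl = [512, 1024, 2048, 4096]
ttft_ms = [40.0, 80.0, 170.0, 380.0]

estimate = interpolate(isl, ttft_ms, 1536)   # halfway between 1024 and 2048
```

Clamping at the ends is a deliberately conservative choice for the sketch: an estimate outside the profiled range is reported as the nearest measured value, which makes it visible that the query left the region the data actually covers.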

NVIDIA context

Profiler is the measurement and configuration-discovery layer inside NVIDIA-Dynamo. It feeds Dynamo-Planner, informs Dynamo-Disaggregated-Serving layouts, pairs with NVIDIA-AIPerf for online benchmarking, and profiles across backends including TensorRT-LLM, vLLM, and SGLang.

Connections

Source Excerpts

  • NVIDIA docs describe Dynamo Profiler as an automated performance analysis tool for optimizing deployment configurations.
  • Current docs list online AIPerf profiling and offline AI Configurator profiling methods.

Resources