NVIDIA Augmented Reality SDK

Type: SDK
Tags: NVIDIA, AR SDK, augmented reality, face tracking, body pose, eye contact, lip sync, active speaker, Maxine
Related: NVIDIA-AI-for-Media-SDKs, NVIDIA-Maxine, NIM-for-Maxine-Eye-Contact, NIM-for-Maxine-Active-Speaker-Detection, NIM-for-Maxine-Audio2Face-2D, NIM-for-Audio2Face-3D, NVIDIA-Video-Effects-SDK, NVIDIA-Triton-AR-VFX-SDKs, NVIDIA-RTX, Triton-Inference-Server
Sources: https://docs.nvidia.com/maxine/ar/latest/index.html, https://docs.nvidia.com/nim/maxine/eye-contact/latest/overview.html, https://docs.nvidia.com/nim/maxine/active-speaker-detection/latest/overview.html, https://docs.nvidia.com/nim/maxine/audio2face-2d/latest/overview.html
Last Updated: 2026-04-29

Summary

NVIDIA Augmented Reality SDK (AR SDK) provides AI-powered, real-time modeling and tracking of human faces and bodies from video. The current documentation lists face detection and tracking, facial landmarks, 3D body pose, eye contact, expression estimation, lip sync, and active speaker detection. The Maxine NIM documentation exposes related capabilities (Eye Contact, Active Speaker Detection, and Audio2Face-2D) as deployable services.

Detail

Purpose

AR and video applications often need to interpret the people in a camera stream in real time. The AR SDK packages GPU-accelerated models and APIs for face and body tracking, gaze, expressions, and speech-related visual cues.

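Current features

  • Face detection and tracking
  • Facial landmarks
  • 3D body pose
  • Eye contact
  • Expression estimation
  • Lip sync
  • Active speaker detection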

Deployment notes

The current documentation covers installation on Windows and Linux, sample applications, performance reference data, container usage, GPU and CPU buffer handling, multi-GPU support, and API references.

Connections

Source Excerpts

  • NVIDIA’s AR SDK docs describe AI-powered face/body modeling and tracking with features such as landmarks, body pose, eye contact, lip sync, and active speaker detection.

Resources