cuSOLVERMp

Type: Technology
Tags: NVIDIA, CUDA, cuSOLVER, distributed linear algebra, dense solvers, eigenvalues, HPC
Related: cuSOLVER, cuBLASMp, NCCL, NVSHMEM, NVIDIA-DGX-SuperPOD, NVIDIA-CUDA
Sources: https://docs.nvidia.com/cuda/cusolvermp/index.html
Last Updated: 2026-04-29

Summary

cuSOLVERMp is NVIDIA’s distributed-memory, GPU-accelerated dense linear solver and eigensolver library. It provides ScaLAPACK-like C APIs for solving dense linear systems and eigenvalue problems using distributed 2D block-cyclic data layouts.

Detail

Purpose

Large dense linear systems can exceed the memory or performance envelope of a single GPU or process. cuSOLVERMp brings NVIDIA GPU acceleration to distributed dense solver workflows common in HPC, engineering simulation, and scientific computing.
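
To make the memory argument concrete, the sketch below estimates the footprint of a dense square matrix and the share each process would hold if the matrix were split evenly over a process grid. The function names, the 200,000-row problem size, and the 4 x 4 grid are illustrative assumptions, not values from NVIDIA's documentation; the estimate also ignores solver workspace, communication buffers, and uneven edge blocks.

```python
def dense_matrix_gib(n, bytes_per_element=8):
    """Storage for an n x n dense matrix in GiB (8 bytes = float64)."""
    return n * n * bytes_per_element / 2**30


def per_process_gib(n, grid_rows, grid_cols, bytes_per_element=8):
    """Approximate per-process share, assuming a perfectly even split
    over a grid_rows x grid_cols process grid (an idealization)."""
    return dense_matrix_gib(n, bytes_per_element) / (grid_rows * grid_cols)


if __name__ == "__main__":
    n = 200_000  # hypothetical problem size
    print(f"full matrix: {dense_matrix_gib(n):.1f} GiB")
    print(f"per process on a 4 x 4 grid: {per_process_gib(n, 4, 4):.1f} GiB")
```

Under these assumptions the full float64 matrix needs roughly 298 GiB, more than a single GPU typically offers, while each of 16 processes holds under 19 GiB of matrix data.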

Key capabilities

  • Distributed dense linear-system and eigenvalue solvers.
  • Works with matrices distributed in a 2D block-cyclic layout across a process grid.
  • ScaLAPACK-like C APIs for easier mapping from traditional HPC workflows.
  • Available from the NVIDIA Developer Zone, in the NVIDIA HPC SDK, on PyPI (CUDA 12 and CUDA 13 builds), and via conda.
  • Designed for multi-process, multi-GPU environments.
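
The 2D block-cyclic layout mentioned above can be illustrated with a small index-mapping sketch. It follows the usual ScaLAPACK convention (blocks are dealt round-robin to processes along each dimension) and assumes 0-based indices with the first block on process 0. The helper names are hypothetical and are not cuSOLVERMp API calls; `local_count` mirrors the logic of ScaLAPACK's NUMROC routine.

```python
def owner_and_local(g, nb, nprocs):
    """Map a 0-based global index g along one dimension to
    (owning process coordinate, 0-based local index), for block size nb."""
    block = g // nb                      # which global block g falls in
    proc = block % nprocs                # blocks are dealt round-robin
    local = (block // nprocs) * nb + g % nb
    return proc, local


def local_count(n, nb, iproc, nprocs):
    """Number of indices of an n-long dimension stored on process iproc
    (same result as ScaLAPACK NUMROC with source process 0)."""
    full_blocks = n // nb
    count = (full_blocks // nprocs) * nb  # whole rounds of blocks
    extra = full_blocks % nprocs          # leftover full blocks
    if iproc < extra:
        count += nb                       # gets one more full block
    elif iproc == extra:
        count += n % nb                   # gets the trailing partial block
    return count


def owner_2d(i, j, mb, nb, grid_rows, grid_cols):
    """Map global element (i, j) to ((prow, pcol), (local_i, local_j))."""
    prow, li = owner_and_local(i, mb, grid_rows)
    pcol, lj = owner_and_local(j, nb, grid_cols)
    return (prow, pcol), (li, lj)


if __name__ == "__main__":
    # 9 x 9 matrix, 2 x 2 blocks, 2 x 2 process grid (illustrative sizes)
    print(owner_2d(5, 2, 2, 2, 2, 2))
    print([local_count(9, 2, p, 2) for p in range(2)])
```

Applying the 1D mapping independently to rows and columns is what makes the layout "2D": each process ends up with a dense local tile whose dimensions are given by `local_count` for its grid row and grid column.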

NVIDIA context

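cuSOLVERMp complements cuBLASMp in the distributed dense linear algebra stack: cuBLASMp covers distributed BLAS operations, and cuSOLVERMp covers distributed solvers and eigensolvers. Together they extend the CUDA-X math libraries from a single GPU to multi-node NVIDIA GPU clusters.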

Connections

  • cuSOLVER - single-node solver library that cuSOLVERMp extends to distributed-memory settings.
  • cuBLASMp - companion distributed dense BLAS library.
  • NVIDIA-DGX-SuperPOD - representative scale-out platform for distributed solvers.
  • NCCL - GPU collective communication library commonly used for data exchange in multi-GPU, multi-node systems.
  • NVPL - CPU-side math libraries for Grace CPUs, which can coexist with GPU distributed solvers.

Source Excerpts

  • NVIDIA describes cuSOLVERMp as a distributed-memory GPU library for dense linear systems and eigenvalue problems.