Minimizing Deep Learning Inference Latency with NVIDIA Multi-Instance GPU

NVIDIA Corporation · Dec. 18, 2020
Recently, NVIDIA unveiled the A100 GPU, based on the NVIDIA Ampere architecture. Ampere introduced many features, including Multi-Instance GPU (MIG), that play a special role for deep learning (DL) applications. MIG makes it possible to use a single A100 GPU as if it were multiple smaller GPUs, maximizing utilization for DL workloads and providing […]
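To make the "multiple smaller GPUs" idea concrete, here is a minimal Python sketch of one common pattern: launching one inference process per MIG device, each pinned to its own slice via CUDA_VISIBLE_DEVICES. It assumes an A100 with MIG mode enabled and GPU/compute instances already created through nvidia-smi, and `infer.py` is a placeholder for your own inference script; the parsing of `nvidia-smi -L` output is an assumption about its format at the time of writing, not an official API.

```python
# Sketch: run one inference worker per MIG instance (assumes MIG mode is enabled
# and instances were created beforehand, e.g. with `nvidia-smi mig -cgi ... -C`).
import os
import subprocess

def list_mig_uuids():
    """Parse `nvidia-smi -L` for MIG device UUIDs (lines such as
    '  MIG 3g.20gb Device 0: (UUID: MIG-GPU-xxxx/1/0)')."""
    out = subprocess.run(["nvidia-smi", "-L"],
                         capture_output=True, text=True, check=True).stdout
    uuids = []
    for line in out.splitlines():
        if "MIG" in line and "UUID:" in line:
            uuids.append(line.split("UUID:")[1].strip().rstrip(")"))
    return uuids

def launch_worker(mig_uuid, script="infer.py"):
    """Start an inference process that sees only one MIG slice.
    `infer.py` is a hypothetical placeholder for your inference script."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=mig_uuid)
    return subprocess.Popen(["python", script], env=env)

if __name__ == "__main__":
    workers = [launch_worker(uuid) for uuid in list_mig_uuids()]
    for w in workers:
        w.wait()
```

Because each worker only sees its own MIG device, the processes run with isolated memory and compute resources, which is what lets several low-latency inference services share one A100.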