NVIDIA GB200 NVL72

Unlocking Real-Time Trillion-Parameter Models

The GB200 NVL72 connects 36 Grace CPUs and 72 Blackwell GPUs in a rack-scale, liquid-cooled design. Its 72-GPU NVLink domain acts as a single, massive GPU and, according to NVIDIA, delivers up to 30X faster real-time inference for trillion-parameter large language models (LLMs) than the same number of NVIDIA H100 Tensor Core GPUs.

The GB200 Grace Blackwell Superchip is a key component of the NVIDIA GB200 NVL72. It connects two high-performance NVIDIA Blackwell Tensor Core GPUs to an NVIDIA Grace™ CPU over the NVIDIA NVLink™-C2C interconnect.

Specification           GB200 NVL72                             GB200 Grace Blackwell Superchip
Configuration           36 Grace CPUs : 72 Blackwell GPUs       1 Grace CPU : 2 Blackwell GPUs
FP4 Tensor Core         1,440 PFLOPS                            40 PFLOPS
FP8/FP6 Tensor Core     720 PFLOPS                              20 PFLOPS
INT8 Tensor Core        720 POPS                                20 POPS
FP16/BF16 Tensor Core   360 PFLOPS                              10 PFLOPS
TF32 Tensor Core        180 PFLOPS                              5 PFLOPS
FP32                    5,760 TFLOPS                            160 TFLOPS
FP64                    2,880 TFLOPS                            80 TFLOPS
FP64 Tensor Core        2,880 TFLOPS                            80 TFLOPS
GPU Memory | Bandwidth  Up to 13.4 TB HBM3e | 576 TB/s          Up to 372 GB HBM3e | 16 TB/s
NVLink Bandwidth        130 TB/s                                3.6 TB/s
CPU Core Count          2,592 Arm® Neoverse V2 cores            72 Arm Neoverse V2 cores
CPU Memory | Bandwidth  Up to 17 TB LPDDR5X | Up to 18.4 TB/s   Up to 480 GB LPDDR5X | Up to 512 GB/s
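
The two columns of the table appear to be related by a simple factor: a rack with 36 Grace CPUs at one CPU per superchip holds 36 superchips. The sketch below checks that reading by scaling the superchip figures by 36 and comparing them with the listed rack figures. The dictionary keys, the function names, and the decimal unit conversion (1 TB = 1,000 GB) are assumptions made for this illustration and do not come from NVIDIA.

```python
# Figures copied from the specification table; the unit is noted in each key.
# Key names and decimal TB -> GB conversion are assumptions of this sketch.
SUPERCHIPS_PER_RACK = 36  # 36 Grace CPUs, one Grace CPU per superchip

superchip = {
    "fp4_pflops": 40,
    "fp8_pflops": 20,
    "fp16_pflops": 10,
    "tf32_pflops": 5,
    "fp32_tflops": 160,
    "fp64_tflops": 80,
    "hbm_gb": 372,
    "hbm_bw_tbps": 16,
    "nvlink_tbps": 3.6,
    "cpu_cores": 72,
    "cpu_mem_gb": 480,
    "cpu_mem_bw_gbps": 512,
}

rack_listed = {
    "fp4_pflops": 1440,
    "fp8_pflops": 720,
    "fp16_pflops": 360,
    "tf32_pflops": 180,
    "fp32_tflops": 5760,
    "fp64_tflops": 2880,
    "hbm_gb": 13400,           # listed as "up to 13.4 TB"
    "hbm_bw_tbps": 576,
    "nvlink_tbps": 130,
    "cpu_cores": 2592,
    "cpu_mem_gb": 17000,       # listed as "up to 17 TB"
    "cpu_mem_bw_gbps": 18400,  # listed as "up to 18.4 TB/s"
}


def scale_to_rack(per_superchip, count=SUPERCHIPS_PER_RACK):
    """Multiply every per-superchip figure by the number of superchips."""
    return {name: value * count for name, value in per_superchip.items()}


def relative_gap(computed, listed):
    """Return the relative difference between a computed and a listed value."""
    return abs(computed - listed) / listed


rack_computed = scale_to_rack(superchip)
gaps = {
    name: relative_gap(rack_computed[name], rack_listed[name])
    for name in rack_listed
}

for name, gap in gaps.items():
    print(
        f"{name:18} computed={rack_computed[name]:>10.1f} "
        f"listed={rack_listed[name]:>10.1f} gap={gap:.2%}"
    )
```

Under these assumptions the compute rows and the CPU core count scale exactly by 36, while the memory and bandwidth rows differ from the listed values only by rounding. For example, 36 × 3.6 TB/s = 129.6 TB/s against the listed 130 TB/s, and 36 × 480 GB = 17,280 GB against the listed 17 TB.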