Nvidia today announced its new Rubin CPX GPU, a "purpose-built GPU designed to meet the demands of long-context AI workloads." The Rubin CPX, not to be confused with the plain Rubin GPU, is an AI accelerator focused on maximizing the inference performance of the upcoming Vera Rubin NVL144 CPX rack.
As AI workloads evolve, the computing architectures designed to power them are evolving alongside them. Nvidia's new strategy for boosting inference, termed disaggregated inference, relies on multiple distinct types of GPUs working in concert to reach peak performance. Compute-focused GPUs handle what Nvidia calls the "context phase," while chips optimized for memory bandwidth handle the throughput-intensive "generation phase."
The company explains that cutting-edge AI workloads involving multi-step reasoning and persistent memory, like AI video generation or agentic AI, benefit from the availability of huge amounts of context information. Inference for these large AI models has become the new frontier for AI hardware development, as opposed to training those models.
To this end, the Rubin CPX GPU is designed to be a workhorse for the compute-intensive context phase of disaggregated inference (more on that below), while the standard Rubin GPU handles the more memory-bandwidth-limited generation phase.
Rubin CPX is good for 30 PFLOPs of raw compute performance on the company's new NVFP4 data type, and it has 128 GB of GDDR7 memory. For reference, the standard Rubin GPU will reach 50 PFLOPs of FP4 compute and is paired with 288 GB of HBM4 memory.
Early renders of the Rubin CPX GPU, such as the one above, appear to show a single-die design. The standard Rubin GPU will be a dual-die chiplet design, and as ComputerBase points out, one half of it would output 25 PFLOPs of FP4, half of the full chip's 50 PFLOPs and not far off the CPX's 30. That has led some to speculate that Rubin CPX is a single, hyper-optimized slice of a full-fat Rubin GPU.
The choice to outfit Rubin CPX with GDDR7 rather than HBM4 is also one of optimization. As mentioned, disaggregated inference workflows split the inference process between the Rubin and Rubin CPX GPUs. Once the compute-optimized Rubin CPX has built the context for a task, for which the performance characteristics of GDDR7 are sufficient, it passes the ball to a Rubin GPU for the generation phase, which benefits from the use of high-bandwidth memory.
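To make the division of labor concrete, here is a minimal sketch of the two-phase flow, assuming a Hugging Face-style causal LM interface (a model that returns .logits and a legacy tuple-format .past_key_values KV cache). The function name, device placements, and greedy decoding are illustrative assumptions, not Nvidia's software stack.

```python
import torch

@torch.no_grad()
def disaggregated_generate(ctx_model, gen_model, prompt_ids, max_new_tokens):
    # Context phase (Rubin CPX's role): a single compute-bound pass
    # over the whole prompt builds the KV cache. Assumes ctx_model
    # lives on cuda:0 and gen_model on cuda:1.
    out = ctx_model(prompt_ids.to("cuda:0"), use_cache=True)
    kv = out.past_key_values

    # Handoff: ship the KV cache to the generation GPU. In a real rack
    # this would be a fast interconnect transfer, not a .to() copy.
    kv = tuple((k.to("cuda:1"), v.to("cuda:1")) for k, v in kv)
    next_id = out.logits[:, -1].argmax(-1, keepdim=True).to("cuda:1")

    # Generation phase (standard Rubin's role): token-by-token decode,
    # limited by memory bandwidth -- hence the HBM4 on the base Rubin.
    tokens = [next_id]
    for _ in range(max_new_tokens - 1):
        out = gen_model(next_id, past_key_values=kv, use_cache=True)
        kv = out.past_key_values
        next_id = out.logits[:, -1].argmax(-1, keepdim=True)
        tokens.append(next_id)
    return torch.cat(tokens, dim=-1)
```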
Rubin CPX will be available inside Nvidia's Vera Rubin NVL144 CPX rack, coming alongside Vera Rubin in 2026. The rack, which will contain 144 Rubin GPUs, 144 Rubin CPX GPUs, 36 Vera CPUs, 100 TB of high-speed memory, and 1.7 PB/s of memory bandwidth, is slated to deliver 8 exaFLOPs of NVFP4 compute. That is 7.5x the performance of the current-gen GB300 NVL72, and it beats the 3.6 exaFLOPs of the base Vera Rubin NVL144 without CPX.
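Those rack-level figures are consistent with the per-GPU numbers quoted above; a quick back-of-envelope check, assuming the CPX units' compute simply adds on top of the base NVL144's 3.6 exaFLOPs:

```python
# Sanity check of the ~8 exaFLOPs figure from numbers in this article.
base_nvl144 = 3.6               # exaFLOPs NVFP4, base Vera Rubin NVL144
cpx_total = 144 * 30 / 1000     # 144 Rubin CPX GPUs at 30 PFLOPs each = 4.32
print(base_nvl144 + cpx_total)  # 7.92 exaFLOPs, in line with the quoted ~8
```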
Nvidia claims that $100 million spent on AI systems with Rubin CPX could translate to $5 billion in revenue. For everything we know about the upcoming Vera Rubin AI platform, see our premium coverage of Nvidia's roadmap. We expect to see Rubin, Rubin CPX, and Vera Rubin in person at Nvidia's GTC 2026 presentation this March.