Nvidia today announced its new Rubin CPX GPU, described as a “purpose-built GPU designed to meet the demands of long-context AI workloads.” The Rubin CPX, not to be confused with the standard Rubin GPU, is an AI accelerator focused on maximizing inference performance in the upcoming Vera Rubin NVL144 CPX rack.

As AI workloads evolve, the computing architectures designed to power them are evolving alongside them. Nvidia’s new strategy for boosting inference, termed disaggregated inference, relies on multiple distinct types of GPUs working together to reach peak performance. Compute-focused GPUs will handle what Nvidia calls the “context phase,” while different chips optimized for memory bandwidth will handle the throughput-intensive “generation phase.”
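To make the split concrete, here is a minimal, toy Python sketch of the idea behind disaggregated inference: the prompt is processed in one parallel, compute-heavy prefill (context) pass that produces a KV cache, which is then handed off to a token-by-token, bandwidth-bound decode (generation) loop. The tiny model, device split, and function names below are illustrative assumptions, not Nvidia’s hardware or software interface.

```python
# Conceptual sketch of disaggregated inference (illustrative only):
# the compute-heavy "context" (prefill) phase could run on one accelerator,
# while the bandwidth-bound "generation" (decode) phase runs on another,
# with the KV cache handed off between them.

import numpy as np

D_MODEL = 64          # toy hidden size
VOCAB = 100           # toy vocabulary size
rng = np.random.default_rng(0)
W_qkv = rng.standard_normal((D_MODEL, 3 * D_MODEL)) / np.sqrt(D_MODEL)
W_out = rng.standard_normal((D_MODEL, VOCAB)) / np.sqrt(D_MODEL)
embed = rng.standard_normal((VOCAB, D_MODEL))

def prefill(prompt_tokens):
    """Context phase: process the whole prompt in parallel (compute-bound).
    Returns the KV cache that the generation phase will reuse."""
    x = embed[prompt_tokens]                     # (seq, d_model)
    q, k, v = np.split(x @ W_qkv, 3, axis=-1)
    return {"k": k, "v": v}                      # KV cache to hand off

def decode_step(kv_cache, last_token):
    """Generation phase: one token at a time, dominated by reading the
    ever-growing KV cache (memory-bandwidth-bound)."""
    x = embed[last_token][None, :]               # (1, d_model)
    q, k, v = np.split(x @ W_qkv, 3, axis=-1)
    kv_cache["k"] = np.vstack([kv_cache["k"], k])
    kv_cache["v"] = np.vstack([kv_cache["v"], v])
    attn = (q @ kv_cache["k"].T) / np.sqrt(D_MODEL)
    weights = np.exp(attn - attn.max())
    weights /= weights.sum()
    ctx = weights @ kv_cache["v"]                # attention over full cache
    return int(np.argmax(ctx @ W_out))           # greedy next token

# Prefill once, then stream the cache to the decode loop.
cache = prefill(np.array([1, 7, 42, 3]))
token = 3
for _ in range(5):
    token = decode_step(cache, token)
    print("generated token:", token)
```

The two functions have very different hardware profiles, which is the point of the disaggregated approach: prefill is a large matrix multiply that rewards raw compute, while each decode step re-reads the entire cache and rewards memory bandwidth.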

