ZDNET’s key takeaways
- Nvidia’s new platform aims to reduce the cost of training LLMs.
- It uses six AI chips to lower token costs and GPU requirements.
- The first platforms will roll out to partners later in the year.
The last several years have been stupendous for Nvidia. When generative AI became all the rage, demand for the tech giant’s hardware skyrocketed as companies and developers scrambled for its graphics cards to train their large language models (LLMs). During CES 2026, Nvidia held a press conference to unveil its latest innovation in the AI space: the Rubin platform.
Also: CES 2026: Biggest TV, smart glasses, phone news, and more we’ve seen so far
Nvidia went deep on what the technology can do, and it’s all pretty dense, so to keep things concise, I’m focusing only on the highlights.
Rubin is an AI supercomputing platform designed to make “building, deploying, and securing the world’s largest and most advanced AI systems at the lowest cost” possible. According to Nvidia, the platform can cut inference token costs by up to 10x and train mixture-of-experts (MoE) models with a quarter of the GPUs that the previous-generation Blackwell platform requires.
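To put those multipliers in perspective, here’s some quick back-of-envelope Python. The baseline figures are placeholders I made up, not Nvidia pricing; only the 10x and 4x factors come from Nvidia’s claims.

```python
# Back-of-envelope math on Nvidia's claimed multipliers.
# The baseline numbers are hypothetical placeholders, NOT real pricing;
# only the 10x token-cost and 4x GPU-count factors come from Nvidia's claims.

baseline_cost_per_million_tokens = 2.00  # hypothetical Blackwell-era cost (USD)
baseline_gpus_for_moe_training = 4096    # hypothetical GPU count for an MoE run

rubin_cost_per_million_tokens = baseline_cost_per_million_tokens / 10  # "up to 10x" cheaper
rubin_gpus_for_moe_training = baseline_gpus_for_moe_training // 4      # "4x fewer" GPUs

print(f"Inference: ${baseline_cost_per_million_tokens:.2f} -> "
      f"${rubin_cost_per_million_tokens:.2f} per million tokens")
print(f"MoE training: {baseline_gpus_for_moe_training} -> "
      f"{rubin_gpus_for_moe_training} GPUs")
```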
The easiest way to think about Nvidia Rubin is to imagine Blackwell, but on a much grander scale.
The goal with Rubin is to accelerate mainstream adoption of advanced AI models, particularly in the consumer space. One of the biggest hurdles holding back widespread adoption of LLMs is cost. As models grow larger and more complex, the hardware and infrastructure required to train and support the models become astronomically expensive. By sharply reducing those token costs via Rubin, Nvidia hopes to make large-scale AI deployment more practical.
Also: Nvidia’s physical AI models clear the way for next-gen robots – here’s what’s new
Nvidia said that it used an “extreme co-design” approach when developing the Rubin platform, creating a single AI supercomputer made up of six integrated chips. At the center is the Nvidia Vera CPU, an energy-efficient processor for large-scale AI factories, built with 88 custom Olympus cores, full Armv9.2 compatibility, and fast NVLink-C2C connectivity.
Working alongside the CPU is the Nvidia Rubin GPU, the platform’s primary workhorse. Sporting a third-generation Transformer Engine, it can deliver up to 50 petaflops of NVFP4 compute. Tying everything together is the Nvidia NVLink 6 Switch, which enables ultra-fast GPU-to-GPU communication. Nvidia’s ConnectX-9 SuperNIC handles high-speed networking, while the BlueField-4 DPU offloads infrastructure tasks from the CPU and GPU so they can focus on AI workloads.
Rounding everything out is the company’s Spectrum-6 Ethernet switch, which provides next-generation networking for AI data centers.
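If you’d rather see the six-chip lineup at a glance, here it is sketched as a simple Python mapping. The role summaries paraphrase Nvidia’s announcement, not an official spec sheet.

```python
# The six integrated chips of the Rubin platform, as described by Nvidia.
# Role summaries paraphrase the announcement; this is a recap, not a spec sheet.

rubin_platform = {
    "Vera CPU":            "88 custom Olympus cores, Armv9.2, NVLink-C2C to the GPU",
    "Rubin GPU":           "primary workhorse; 3rd-gen Transformer Engine, up to 50 PF NVFP4",
    "NVLink 6 Switch":     "ultra-fast GPU-to-GPU communication",
    "ConnectX-9 SuperNIC": "high-speed networking",
    "BlueField-4 DPU":     "offloads infrastructure tasks from the CPU and GPU",
    "Spectrum-6 Ethernet": "next-gen networking for AI data centers",
}

for chip, role in rubin_platform.items():
    print(f"{chip}: {role}")
```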
Also: The most exciting AI wearable at CES 2026 might not be smart glasses after all
Rubin will be available in multiple configurations, such as the Nvidia Vera Rubin NVL72, which combines 36 Nvidia Vera CPUs, 72 Nvidia Rubin GPUs, an Nvidia NVLink 6 switch, multiple Nvidia ConnectX-9 SuperNICs, and Nvidia BlueField-4 DPUs.
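Multiplying the per-GPU figure Nvidia quoted by the NVL72’s GPU count gives a naive ceiling for a single rack. Treat this as rough arithmetic; real aggregate throughput depends on configuration and precision mode.

```python
# Naive peak-compute arithmetic for a Vera Rubin NVL72 rack.
# Uses only figures from the announcement: 72 GPUs, up to 50 petaflops
# of NVFP4 per GPU. Real aggregate throughput depends on configuration.

gpus_per_rack = 72
nvfp4_petaflops_per_gpu = 50  # "up to" figure from Nvidia

rack_petaflops = gpus_per_rack * nvfp4_petaflops_per_gpu
print(f"Naive NVL72 peak: {rack_petaflops} PF NVFP4 "
      f"(~{rack_petaflops / 1000:.1f} exaflops)")
```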
Judging from the announcements, I don’t think these supercomputing platforms will be something the average person can buy at Best Buy. Nvidia said that the first of these Rubin platforms will roll out to partners sometime in the second half of 2026. Among the first will be Amazon Web Services, Google Cloud, and Microsoft. If Nvidia’s gamble pays off, these machines could usher in a new era of AI computing where scale is far more manageable.