Nvidia Unveils New AI Chip Set to Cut Cost of Running LLMs

by Mayowa Adebajo · 3 min read
Photo: NVIDIA / Facebook

GPUs have recently become the first-choice chips for large AI models that support generative AI software.

Nvidia has turned up the heat on its competitors in the artificial intelligence (AI) hardware space, unveiling its latest AI chip on Tuesday in a move that may set it further apart from rivals such as AMD, Amazon, and Google.

According to a BBC report, Nvidia currently dominates the AI chip market with a share that tops 80%. That dominance is largely down to its strength in making graphics processing units (GPUs), which have recently become the first-choice chips for the large AI models that support generative AI software, such as OpenAI’s ChatGPT and Google’s Bard.

With huge demand from tech giants, cloud providers, and startups alike, Nvidia’s chips have been in short supply. It is against this backdrop that the company is rolling out its new chip, the GH200.

Nvidia Shares Details About the New GH200 AI Chip

According to the company, the new chip has a GPU similar to that of the H100, currently its most expensive AI chip, but improves on both memory and the processor. The GH200 will come with 141 gigabytes of memory, up from the H100’s 80GB, along with a 72-core ARM central processor, says Nvidia.

Speaking about the new chip at a conference on Tuesday, Nvidia CEO Jensen Huang noted that “the processor is designed for the scale-out of the world’s data centres.” In other words, the company built the chip for large-scale deployment from the outset.

Meanwhile, working with AI models typically involves two stages: training and inference. A model must first be trained extensively, a process that can take months and tie up thousands of GPUs. The trained model is then used in software to generate content, which is the inference stage.
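To make the distinction concrete, below is a minimal, illustrative Python sketch of the inference stage, using the Hugging Face transformers library and the small GPT-2 model as stand-ins (neither is mentioned in the article); the expensive training stage is assumed to have already happened.

```python
# Minimal inference sketch (illustrative; the article does not
# name any specific software or model). The costly training stage
# is assumed done: we simply load an already-trained model.
from transformers import pipeline

# GPT-2 stands in here for a much larger LLM.
generator = pipeline("text-generation", model="gpt2")

# Inference: each call generates content from a prompt. This
# per-request workload is what chips like the GH200 would serve.
output = generator("AI chips are", max_new_tokens=20)
print(output[0]["generated_text"])
```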

Inference, however, is expensive and requires a lot of processing power, and it is this running cost that Nvidia aims to cut with the new GH200, Huang disclosed. Given the chip’s larger memory capacity, it is safe to conclude that it has been designed with inference in mind. Huang notes:

“You can take pretty much any large language model (LLM) you want and put it in this and it will inference like crazy.”
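A rough back-of-the-envelope calculation suggests why the extra memory matters for inference (the model size and precision below are illustrative assumptions, not figures from the article or Nvidia):

```python
# Why memory capacity matters for inference: a model's weights
# must fit in the accelerator's memory. The 70B-parameter size
# and fp16 precision are assumptions chosen for illustration.
params = 70e9          # assumed LLM size: 70 billion parameters
bytes_per_param = 2    # fp16: 2 bytes per parameter

weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB of weights")  # ~140 GB

# ~140 GB fits in the GH200's 141 GB but not the H100's 80 GB,
# so a single GH200 could hold a model that a single H100 could not.
```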

The CEO expects the cost of running inference on LLMs to drop significantly as a result. The GH200 will be available from Nvidia’s distributors in the second quarter of 2024, though Huang projects that the chip may already be available for sampling by the end of this year.

As of publication, Nvidia has yet to announce an official price for the new chip.
