Exacluster reveals one of the industry's first clusters based on Nvidia's H200 Hopper GPUs for AI and HPC: 3,456 CPU cores


Will Bryk, chief executive of ExaAILabs, announced on Friday that his company had deployed its Exacluster, one of the industry's first clusters based on Nvidia's H200 GPUs for AI and HPC. The cluster will be used to build a search engine intended to understand users better than Google does and to return better search results.

Truth be told, the Exacluster has nothing to do with ExaFLOPS-scale performance. It is called the Exacluster because it comprises 18 eight-way Nvidia H200-based servers (exa means quintillion, or 10^18). The cluster provides 144 H200 GPUs with about 20TB of HBM3E memory (141GB per GPU), delivering a combined compute performance of 569,958 TOPS (around 570 peta-operations per second). The cluster will be used to train ExaAI's neural networks.

The cluster is based on 36 96-core processors (two per server, or 192 cores per machine, for 3,456 cores in total) and is equipped with 36TB of DDR5 memory and 270TB of NVMe solid-state storage. The whole cluster consumes 100kW of power. Only two of these machines are installed per rack to ensure that all servers get enough cooling. The machines use standard air cooling, which Bryk expects to be enough for prolonged operation under load.
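The headline figures can be cross-checked with a little arithmetic. The sketch below uses only numbers quoted in the article and derives the rest; the variable and function names are illustrative, decimal units (1TB = 1,000GB) are assumed, and the per-server power value is simply an even split of the cluster total, not a measured figure.

```python
# Figures quoted in the article; everything else below is derived from them.
SERVERS = 18
GPUS_PER_SERVER = 8
HBM_PER_GPU_GB = 141
TOTAL_TOPS = 569_958
TOTAL_CPU_CORES = 3_456
CORES_PER_CPU = 96
DDR5_TB = 36
NVME_TB = 270
POWER_KW = 100


def cluster_summary():
    """Derive totals and per-server shares from the quoted figures."""
    gpus = SERVERS * GPUS_PER_SERVER
    cores_per_server = TOTAL_CPU_CORES // SERVERS
    return {
        "gpus": gpus,
        # Assumes decimal units: 1 TB = 1,000 GB.
        "hbm_total_tb": gpus * HBM_PER_GPU_GB / 1000,
        # Per-GPU throughput implied by the quoted cluster total.
        "tops_per_gpu": TOTAL_TOPS / gpus,
        # 1 peta-operation per second = 1,000 TOPS.
        "total_pops": TOTAL_TOPS / 1000,
        "cores_per_server": cores_per_server,
        "cpus_per_server": cores_per_server // CORES_PER_CPU,
        "ddr5_per_server_tb": DDR5_TB / SERVERS,
        "nvme_per_server_tb": NVME_TB / SERVERS,
        # Even split of the cluster total, not a measured value.
        "power_per_server_kw": POWER_KW / SERVERS,
    }


if __name__ == "__main__":
    for name, value in cluster_summary().items():
        print(f"{name}: {value:,.2f}")
```

Run as a script, this prints one derived value per line. The GPU count and the roughly 20TB of HBM3E follow directly from 18 servers with eight 141GB GPUs each, and the quoted core count works out to 192 cores, or two 96-core sockets, per server.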

We just finished setting up the Exacluster:
- 144 H200s
- 3456 CPUs
- 270TB NVME SSD
- 20TB GPU RAM
- 36TB CPU RAM
- 100KW operating power
Prepare yourselves for what's coming.. pic.twitter.com/Ulhp470Spz
January 10, 2025

The cluster cost around $5 million (according to Bryk), which works out to about $277,777 per machine. That is comparable to the price of a single 8-way H200 baseboard rather than a complete server. It is unclear how exactly ExaAI managed to get such a low price, or how it obtained H200-based machines ahead of many other companies. Still, Nvidia is one of the company's lead investors, along with Lightspeed and Y Combinator.
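The per-machine figure is a straight division of the quoted total. As a rough sketch, the same arithmetic can be extended to a per-GPU share; the function name is illustrative, and the per-GPU number simply spreads the whole server cost across its eight GPUs, so it is not a GPU price.

```python
CLUSTER_COST_USD = 5_000_000  # "around $5 million", according to Bryk
SERVERS = 18
GPUS_PER_SERVER = 8


def unit_costs(total_usd, servers, gpus_per_server):
    """Split a total cluster cost evenly per server and per GPU."""
    per_server = total_usd / servers
    # Whole-server cost divided by GPU count, not the price of a GPU alone.
    per_gpu = per_server / gpus_per_server
    return per_server, per_gpu


per_server, per_gpu = unit_costs(CLUSTER_COST_USD, SERVERS, GPUS_PER_SERVER)
print(f"per server: ${per_server:,.0f}")
print(f"per GPU share: ${per_gpu:,.0f}")
```

Because the $5 million total is itself approximate, both results should be read as rough estimates.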

Companies affiliated with Nvidia in some way typically get its hardware ahead of others. Then again, given its rather modest requirements, ExaAI may have secured its machines without relying on those connections, simply because its use of AI is unusual and of considerable interest to various parties. The ultimate goal of ExaAI is to build a search engine that can understand and process complex queries and return decent results. If the company succeeds, it could completely revolutionize search as we know it.
