AMD nearly beats power efficiency goal a year early — AMD's new AI servers are 28.3 times more efficient than 2020 versions

1 week ago 6

Performance efficiency is key to the rapid increase in the performance of AI and HPC processors, so AMD and other companies fight for it fiercely with every new product generation. Back in 2021, the company set a goal of 2025 to increase the energy efficiency of its EPYC processors and Instinct accelerators by 30 times compared to 2020. It appears that with its latest EPYC 9005-series 'Turin' CPUs and Instinct MI300X GPUs, it has basically achieved its goal, but a year early.

AMD energy efficiency graph

(Image credit: AMD)

To prove its point, AMD used a machine equipped with two 64-core EPYC 9575F CPUs, eight Instinct MI300X accelerators, and 2,304 GB of DDR5 memory and tested its inference performance in Llama3.1-70B (vLLM 0.6.1.post2, TP8 Parallel, FP8, continuous batching) model. Using a complex set of calculations, AMD determined the energy efficiency of this system and compared it to an undisclosed machine from 2020, discovering that the new machine is 28.3 times more energy efficient than the old one.

AMD didn't disclose the specifications of its 2020 system, but we can imagine that it is based on the company's EPYC 7002-series processors, which feature the Zen 2 microarchitecture with up to 64 cores per CPU, and Instinct MI100 accelerators, which are based on the CDNA 1 architecture.

AMD's Instinct MI100 does not support FP8 (unlike MI300X, which supports it at the same rate as INT8), though if we compare INT8 performance of MI100 (184.6 TOPS) and MI300X (2615 TOPS/5230 TOPS with sparsity), the difference will be 14 – 28 times on paper. About the same difference can be observed with FP16, so the comparison is valid. When we factor in dramatically better memory subsystems (32 GB HBM2 at 1.20 GB/s vs. 192 GB HBM3 at 5.30 GB/s) and dramatically better CPUs, it does not come as a surprise that AMD's existing machines are dramatically faster and more performance efficient than its systems from 2020.

AMD itself says that in addition to 'brute force' hardware improvements, its higher performance efficiency was achieved by a combination of architectural advances and software optimizations, which is to be expected.

Just recently, the company introduced its Instinct MI325X accelerators based on the CDNA 3 architecture, yet featuring a 288 GB HBM3E memory subsystem. Next year, the company is set to roll out its Instinct MI355X processors, which will be based on the CDNA 4 architecture and boost compute FP8 and FP16 performance by about 80% compared to the MI325X. In addition to FP8 and FP16, the MI325X will add support for FP4 and FP6 formats for AI, which will increase its peak performance to 9.2 PetaFLOPS (FP4), something that will be useful for many large language models. That said, AMD is more than on track to achieve a 30 times higher energy efficiency of its compute platforms by 2025 when compared to 2020.

"With our thoughtful approach to hardware and software co-design, we are confident in our roadmap to exceed the 30x25 goal and excited about the possibilities ahead, where we see a path to massive energy efficiency improvements within the next couple of years," wrote Sam Naffziger, Senior Vice President, AMD Corporate Fellow, and Product Technology Architect at AMD.

Get Tom's Hardware's best news and in-depth reviews, straight to your inbox.

AMD still sees room for improvement and even sees a pathway to improving energy efficiency by 100X by 2027

Read Entire Article