Now that the AI industry has exceptionally high-performance GPUs with high-bandwidth memory (HBM), one of the bottlenecks that AI training and inference systems face is storage performance. To that end, Nvidia is working with partners to build SSDs that can hit random read performance of 100 million input/output operations per second (IOPS) in small-block workloads, according to Wallace C. Kuo, the head of Silicon Motion (SMI), who spoke with Tom's Hardware in an exclusive interview.
"Right now, they are aiming for 100 million IOPS — which is huge," Kuo told Tom's Hardware.
Modern AI accelerators, such as Nvidia's B200, feature HBM3E memory with bandwidth of around 8 TB/s, which far exceeds what today's storage subsystems can deliver in both overall throughput and latency. Current PCIe 5.0 x4 SSDs top out at around 14.5 GB/s and deliver 2 to 3 million IOPS for both 4K and 512B random reads. Although 4K blocks are better suited for bandwidth, AI models typically perform small, random fetches, which makes 512B blocks a better fit for their latency-sensitive patterns. However, increasing the number of I/O operations per second by roughly 33 times, from about 3 million to 100 million, is hard given the limitations of both SSD controllers and NAND memory.
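To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python that converts IOPS into payload throughput. It uses only the numbers quoted above (3 million and 100 million IOPS, 512B and 4K blocks); the helper name `iops_to_gbps` is made up for illustration, and the math ignores protocol and command overhead.

```python
def iops_to_gbps(iops: float, block_bytes: int) -> float:
    """Payload throughput in decimal GB/s for a given IOPS rate and block size.

    Hypothetical helper for illustration; ignores protocol overhead.
    """
    return iops * block_bytes / 1e9


CURRENT_IOPS = 3e6    # upper end of the 2 to 3 million range quoted above
TARGET_IOPS = 100e6   # the 100 million IOPS target

# How far the target is from today's drives.
speedup = TARGET_IOPS / CURRENT_IOPS

# Payload throughput implied by each IOPS level and block size.
current_512b = iops_to_gbps(CURRENT_IOPS, 512)
current_4k = iops_to_gbps(CURRENT_IOPS, 4096)
target_512b = iops_to_gbps(TARGET_IOPS, 512)
target_4k = iops_to_gbps(TARGET_IOPS, 4096)

print(f"Speedup needed:       {speedup:.1f}x")
print(f"3M IOPS   @ 512B:     {current_512b:.2f} GB/s")
print(f"3M IOPS   @ 4K:       {current_4k:.2f} GB/s")
print(f"100M IOPS @ 512B:     {target_512b:.1f} GB/s")
print(f"100M IOPS @ 4K:       {target_4k:.1f} GB/s")
```

By this arithmetic, 3 million 4K reads per second already amount to about 12.3 GB/s, close to the 14.5 GB/s ceiling of a PCIe 5.0 x4 drive, while 3 million 512B reads move only about 1.5 GB/s. That suggests small-block performance is held back by per-operation limits in the controller and NAND rather than by the link. At 100 million IOPS, even 512B reads would add up to about 51 GB/s of payload, more than a single PCIe 5.0 x4 interface carries today.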
Kioxia, for its part, is already working on an 'AI SSD' based on its XL-Flash memory that is designed to surpass 10 million 512B IOPS. The company currently plans to release this drive during the second half of next year, possibly to coincide with the rollout of Nvidia's Vera Rubin platform. To get to 100 million IOPS, one might use multiple 'AI SSDs.'
However, the head of Silicon Motion believes that achieving 100 million IOPS on a single drive built on conventional NAND, at a reasonable cost and power consumption, will be extremely hard, so a new type of memory might be needed.
"I believe they are looking for a media change," said Kuo. "Optane was supposed to be the ideal solution, but it is gone now. Kioxia is trying to bring XL-NAND and improve its performance. SanDisk is trying to introduce High Bandwidth Flash (HBF), but honestly, I don't really believe in it. Right now, everyone is promoting their own technology, but the industry really needs something fundamentally new. Otherwise, it will be very hard to achieve 100 million IOPS and still be cost-effective."
Currently, many companies, including Micron and SanDisk, are developing new types of non-volatile memory. However, when these new types of memory will be commercially viable is something that even the head of Silicon Motion is not sure about.