Huawei has introduced Flex:ai, an open-source orchestration tool designed to raise the utilization rate of AI chips in large-scale compute clusters. Announced on Friday, November 21, the platform builds on Kubernetes and will be released through Huawei’s ModelEngine developer community. It arrives amid continued U.S. export restrictions on high-end GPU hardware and reflects a growing shift inside China toward software-side efficiency gains as a stopgap for constrained silicon supply.
Beyond bolder rhetoric that such tools will help China “...create an analogue AI chip 1000 times faster than Nvidia’s chips,” Huawei’s concrete claim for Flex:ai is a roughly 30% increase in average utilization. It reportedly achieves this by slicing individual GPU or NPU cards into multiple virtual compute instances and orchestrating workloads across heterogeneous hardware types.
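Conceptually, that kind of card slicing comes down to fractional bookkeeping: reserve a share of a physical accelerator for each workload and place jobs wherever enough capacity remains. Below is a minimal sketch of the idea, assuming a simple memory-quota partitioning scheme; the names, pool, and best-fit policy are illustrative, not Flex:ai's actual interface.

```python
# Illustrative sketch: slicing physical accelerators into virtual instances
# by memory quota. The partitioning scheme and names are assumptions for
# illustration; Flex:ai's real mechanism is not publicly documented.
from dataclasses import dataclass, field


@dataclass
class Accelerator:
    name: str            # e.g. "ascend-910b-0" or "a100-0" (hypothetical IDs)
    total_mem_gb: int    # physical memory on the card
    slices: dict = field(default_factory=dict)  # slice_id -> GB reserved

    def free_mem_gb(self) -> int:
        return self.total_mem_gb - sum(self.slices.values())

    def carve(self, slice_id: str, mem_gb: int) -> bool:
        """Reserve a virtual compute instance if enough memory remains."""
        if mem_gb <= self.free_mem_gb():
            self.slices[slice_id] = mem_gb
            return True
        return False


def place(job_id: str, mem_gb: int, pool: list[Accelerator]) -> str | None:
    """Best-fit placement across a heterogeneous pool of cards."""
    candidates = [a for a in pool if a.free_mem_gb() >= mem_gb]
    if not candidates:
        return None
    best = min(candidates, key=lambda a: a.free_mem_gb())
    best.carve(job_id, mem_gb)
    return best.name


pool = [Accelerator("ascend-910b-0", 64), Accelerator("a100-0", 80)]
print(place("tuning-job-1", 16, pool))   # lands on the tighter-fitting card
print(place("tuning-job-2", 70, pool))   # only the 80 GB card can host it
```

Packing many small jobs onto partially used cards in this way is precisely where the claimed utilization gains would come from: whole-card allocation strands the leftover capacity.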
Flex:ai’s architecture builds on existing open-source Kubernetes foundations but extends them in ways that are still uncommon across open deployments. Kubernetes already supports device plugins to expose accelerators, and schedulers such as Volcano or frameworks like Ray can already perform fractional allocation and gang scheduling. Flex:ai appears to unify these capabilities at a higher layer while adding support for Ascend NPUs alongside standard GPU hardware.
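In stock Kubernetes, fractional requests surface as extended resources advertised by a device plugin. The sketch below uses the official Python client to submit a pod asking for a memory slice of a shared GPU; the `volcano.sh/gpu-memory` resource name follows Volcano's GPU-sharing plugin, but exact names and units vary by plugin and version, and nothing here is Flex:ai-specific.

```python
# Sketch: requesting a fractional GPU slice via a Kubernetes extended
# resource. Assumes a reachable cluster with the Volcano scheduler and a
# GPU-sharing device plugin installed; resource name is illustrative.
from kubernetes import client, config

config.load_kube_config()  # reads cluster credentials from ~/.kube/config

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="fractional-demo"),
    spec=client.V1PodSpec(
        scheduler_name="volcano",  # hand placement to the Volcano scheduler
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="python:3.11-slim",
                command=["python", "-c", "print('got a GPU slice')"],
                resources=client.V1ResourceRequirements(
                    # Ask for ~4 GiB of one card rather than a whole device.
                    limits={"volcano.sh/gpu-memory": "4096"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The uncommon part is not any single piece of this, but coordinating it uniformly across mixed GPU and NPU fleets, which is the layer Flex:ai claims to add.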
The functionality resembles that of Run:ai, an orchestration platform acquired by Nvidia in 2024, which enables multi-tenant scheduling and workload pre-emption across large GPU clusters. Huawei’s version, at least on paper, makes similar claims but emphasizes open-source deployment and cross-accelerator compatibility. That may give it broader relevance in clusters built around Chinese silicon, particularly those using Ascend chips.
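Multi-tenant pre-emption of the kind Run:ai offers is typically layered on Kubernetes' native priority mechanism. As a rough sketch of that primitive, again using the official Python client: a high-priority class for latency-sensitive serving can evict low-priority batch work when the cluster is full. The class names and values are illustrative, and none of this reflects Flex:ai's internal policy.

```python
# Sketch: plain-Kubernetes PriorityClass objects, the building block that
# Run:ai-style orchestrators layer quotas and fairness policies on top of.
from kubernetes import client, config

config.load_kube_config()
scheduling = client.SchedulingV1Api()

# High-priority class: its pods may preempt lower-priority training jobs.
scheduling.create_priority_class(
    client.V1PriorityClass(
        metadata=client.V1ObjectMeta(name="inference-high"),
        value=100000,
        preemption_policy="PreemptLowerPriority",
        description="Latency-sensitive serving; may evict batch jobs.",
    )
)

# Low-priority class: opportunistic filler work, first to be evicted.
scheduling.create_priority_class(
    client.V1PriorityClass(
        metadata=client.V1ObjectMeta(name="batch-low"),
        value=1000,
        description="Preemptible batch training on idle capacity.",
    )
)
# Pods opt in via spec.priorityClassName: "inference-high" or "batch-low".
```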