Meta today announced four successive generations of its in-house Meta Training and Inference Accelerator (MTIA) chips, all developed in partnership with Broadcom and scheduled for deployment within the next two years. “We’ve developed a competitive strategy for MTIA by prioritizing rapid, iterative development,” reads Meta’s press release, which also cites an inference-first focus and frictionless adoption through building natively on industry standards.
| Specification | MTIA 300 | MTIA 400 | MTIA 450 | MTIA 500 |
| --- | --- | --- | --- | --- |
| Workload Focus | R&R Training | General | AI Inference | AI Inference |
| Module TDP | 800 W | 1,200 W | 1,400 W | 1,700 W |
| HBM Bandwidth | 6.1 TB/s | 9.2 TB/s | 18.4 TB/s | 27.6 TB/s |
| HBM Capacity | 216 GB | 288 GB | 288 GB | 384-512 GB |
| MX4 Performance | - | 12 PFLOPS | 21 PFLOPS | 30 PFLOPS |
| FP8/MX8 Performance | 1.2 PFLOPS | 6 PFLOPS | 7 PFLOPS | 10 PFLOPS |
| BF16 Performance | 0.6 PFLOPS | 3 PFLOPS | 3.5 PFLOPS | 5 PFLOPS |
Meta's approach also includes hardware acceleration for FlashAttention and mixture-of-experts feed-forward network computation, plus custom low-precision data types co-designed for inference. MTIA 450 supports MX4, where it delivers six times the FLOPs it achieves in FP16/BF16, with mixed low-precision computation that avoids the software overhead of data type conversion.
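The six-times figure can be checked against the spec table. The short Python sketch below copies the peak-throughput numbers from the table and divides each low-precision figure by the BF16 figure for the same chip. The dictionary layout and the `speedup_over_bf16` helper are illustrative names, not anything from Meta, and the MTIA 300 is left out because the table lists no MX4 figure for it.

```python
# Peak throughput in PFLOPS, copied from the spec table in this article.
# The structure and names here are illustrative, not Meta's.
SPECS = {
    "MTIA 400": {"mx4": 12.0, "fp8": 6.0, "bf16": 3.0},
    "MTIA 450": {"mx4": 21.0, "fp8": 7.0, "bf16": 3.5},
    "MTIA 500": {"mx4": 30.0, "fp8": 10.0, "bf16": 5.0},
}


def speedup_over_bf16(chip: str, fmt: str) -> float:
    """Return peak throughput in `fmt` divided by peak BF16 throughput."""
    row = SPECS[chip]
    return row[fmt] / row["bf16"]


if __name__ == "__main__":
    for chip in SPECS:
        mx4 = speedup_over_bf16(chip, "mx4")
        fp8 = speedup_over_bf16(chip, "fp8")
        print(f"{chip}: MX4 is {mx4:.1f}x BF16, FP8/MX8 is {fp8:.1f}x BF16")
```

By these numbers, the MX4-to-BF16 ratio is 4x on MTIA 400 and 6x on both MTIA 450 and MTIA 500, while FP8/MX8 stays at 2x BF16 across all three. These are peak figures from a spec sheet, so they say nothing about sustained throughput on real models.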
In terms of eventual deployment, MTIA 400, 450, and 500 will all use the same chassis, rack, and network infrastructure, meaning each new chip generation drops into the existing physical footprint for easy interchange. It’s this modularity, Meta says, that’s behind MTIA’s roughly six-month chip cadence, which is much faster than the industry’s typical one-to-two-year cycle.
The software stack runs natively on PyTorch, vLLM, and Triton, with support for torch.compile and torch.export so that production models can be deployed simultaneously on both GPUs and MTIA without MTIA-specific rewrites. Meta said it has already deployed hundreds of thousands of MTIA chips across its apps for inference on organic content and ads.
All this comes just two weeks after Meta disclosed a long-term, $100 billion AI infrastructure agreement with AMD, suggesting that there’s a broader effort at play to reduce dependence on Nvidia across different parts of Meta’s AI stack while keeping MTIA at the core of inference workloads.