Today, Meta unveiled the latest result of its chip development efforts, notably following Intel’s recent announcement of its newest AI accelerator hardware. Dubbed the “next-gen” Meta Training and Inference Accelerator (MTIA), it serves as the successor to last year’s MTIA v1, powering models including those for ranking and recommending display ads on Meta’s platforms, such as Facebook.
Takeaways
- These MTIA chips are part of Meta's growing investment in AI infrastructure and will enable the company to deliver new and better experiences across its apps and technologies.
- Meta is introducing the next generation of its custom-made chips designed for its AI workloads.
- The latest version significantly improves performance over MTIA v1 and helps power the ranking and recommendation ads models on Facebook and Instagram.
Distinguished from its predecessor, MTIA v1, which was built on a 7nm process, the next-gen MTIA is manufactured on a 5nm process. The new chip is also a physically larger design with more processing cores, which pushes power consumption up to 90W from 25W, but it compensates with more internal memory (128MB versus 64MB) and a higher average clock speed (1.35GHz, up from 800MHz).
In its blog post, Meta highlights its rapid development cycle, boasting a timeline of fewer than nine months from initial silicon to production models for the next-gen MTIA. Yet, despite these strides, Meta faces a steep climb to achieve independence from third-party GPUs and match the pace set by its competitors.
The next-gen MTIA is currently deployed across 16 of Meta's data center regions, and Meta asserts that it delivers up to three times better performance than MTIA v1. The company doesn't break down the details behind that claim, but says the figure comes from testing "four key models" on both generations of the chip.
Meta emphasizes that it controls the entire hardware and software stack, which it argues allows greater efficiency than commercially available GPUs. This holistic approach underscores Meta's commitment to optimizing performance across its AI infrastructure.
The unveiling of Meta’s latest hardware marks an unconventional move, coming on the heels of a press briefing on the company’s ongoing generative AI initiatives. Notably, Meta discloses that while the next-gen MTIA is not currently utilized for generative AI training workloads, several programs are underway exploring this potential. Moreover, Meta acknowledges that the next-gen MTIA is intended to complement rather than replace GPUs for model training and execution.
On the software side, Meta describes a runtime stack responsible for interfacing with the driver/firmware. The MTIA Streaming interface abstraction provides the basic and essential operations that both inference and (in the future) training software require to manage device memory, as well as to run operators and execute compiled graphs on the device. Finally, the runtime interacts with the driver, which sits in user space – a decision Meta says lets it iterate faster on the driver and firmware within its production stack.
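To make that division of responsibilities concrete, here is a minimal, hypothetical sketch of a runtime abstraction with the three duties Meta names: managing device memory, running operators, and executing compiled graphs. The class and method names below are invented for illustration and are not Meta's actual MTIA Streaming interface or driver API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DeviceBuffer:
    """A handle to a region of device memory (hypothetical)."""
    handle: int
    size_bytes: int


class StreamingDeviceRuntime:
    """Minimal runtime facade over a user-space driver (illustrative sketch only).

    Mirrors the three responsibilities named in the post: managing device
    memory, running individual operators, and executing compiled graphs.
    """

    def __init__(self) -> None:
        self._next_handle = 0
        self._buffers: Dict[int, DeviceBuffer] = {}

    def alloc(self, size_bytes: int) -> DeviceBuffer:
        # A real stack would call into the user-space driver here;
        # this sketch only tracks the allocation locally.
        self._next_handle += 1
        buf = DeviceBuffer(self._next_handle, size_bytes)
        self._buffers[buf.handle] = buf
        return buf

    def free(self, buf: DeviceBuffer) -> None:
        # Release the device allocation (here, just forget the handle).
        self._buffers.pop(buf.handle, None)

    def run_operator(self, op_name: str, inputs: List[DeviceBuffer]) -> DeviceBuffer:
        # Placeholder dispatch of a single operator to the device.
        return self.alloc(max(1, sum(b.size_bytes for b in inputs)))

    def execute_graph(self, ops: List[str], inputs: List[DeviceBuffer]) -> DeviceBuffer:
        # Placeholder execution of a compiled graph, operator by operator.
        current = list(inputs)
        for op in ops:
            current = [self.run_operator(op, current)]
        return current[0]


# Example usage of the sketch above.
if __name__ == "__main__":
    rt = StreamingDeviceRuntime()
    x = rt.alloc(1024)
    out = rt.execute_graph(["matmul", "relu"], [x])
    rt.free(x)
    rt.free(out)
```

In the actual stack Meta describes, each of these calls would cross into the user-space driver rather than the in-process bookkeeping shown here.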
Amid Meta's measured progress, pressure mounts on its AI teams to rein in costs. With the company projected to spend an estimated $18 billion by the end of 2024 on GPUs for training and running generative AI models, Meta sees in-house hardware development as a cost-effective alternative.
Meta’s Ongoing Investment in Custom Silicon
However, as Meta strives to gain ground, rivals surge ahead, posing challenges for Meta’s leadership. Google’s recent release of its fifth-generation custom chip for AI model training, TPU v5p, alongside its dedicated chip for model execution, Axion, underscores the competitive landscape. Similarly, Amazon and Microsoft have made significant strides in custom AI chip development, intensifying the race for AI supremacy.
Meta's team is currently designing custom silicon that works in cooperation with the company's existing infrastructure as well as with new, more advanced hardware (including next-generation GPUs) that it may leverage in the future.
Several programs are currently underway aimed at expanding the scope of MTIA, including support for GenAI workloads.