
Meta Unveils Llama 4: New, More Powerful, Versatile AI Models
Llama 4 Scout and Maverick deliver superior performance, optimized efficiency, and advanced multimodal capabilities, while Behemoth prepares to redefine AI benchmarks
Isabella V, 6 April 2025


Meta has launched Llama 4 Scout and Maverick, advanced AI models built on a mixture-of-experts (MoE) architecture with multimodal capabilities, which deliver superior performance. Behemoth, a larger teacher model, is still in training.

Key Points:

  • Innovative Models – Llama 4 Scout and Maverick introduce a “mixture of experts” architecture for greater computational efficiency.
  • High Performance – Maverick outperforms GPT-4o and Gemini 2.0 Flash in reasoning and coding benchmarks.
  • Advanced Multimodality – Supports text, images, and video with smoother integration.
  • Availability – Already available for download at llama.com and Hugging Face, with access via WhatsApp and Messenger.

Meta recently announced the launch of its new AI models, Llama 4 Scout and Llama 4 Maverick, marking a significant advancement in the field of multimodal systems. These models are designed to process and integrate different types of data, including text, video, images, and audio, providing unprecedented versatility in AI applications.

Llama 4 Scout is a compact model optimized to run on a single NVIDIA H100 GPU, thanks to the use of Int4 quantization. It offers a context window of 10 million tokens and outperforms competitors such as Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a range of benchmarks. This large context window enables tasks such as multi-document summarization and reasoning over large codebases.
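The Int4 quantization mentioned above compresses model weights to 4-bit integers, which is what lets a model of Scout's size fit on one GPU. The sketch below illustrates the general idea of symmetric 4-bit quantization only; Meta's actual scheme for Scout is not described in this article.

```python
# Minimal sketch of symmetric Int4 quantization: floats are mapped to
# integers in [-8, 7] plus a shared scale factor, roughly quartering
# storage relative to 16-bit floats. Illustrative only -- not Meta's
# published method.

def quantize_int4(weights):
    """Map float weights to integers in [-8, 7] with a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 7.0  # 7 = largest positive Int4 value
    return [max(-8, min(7, round(w / scale))) for w in weights], scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from the Int4 values."""
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.98, -0.07]
q, scale = quantize_int4(weights)
restored = dequantize_int4(q, scale)
# Each restored value is close to the original at a fraction of the storage.
```

The quality cost of this lossy compression is what quantization-aware schemes try to minimize; the benchmark numbers Meta cites are for the quantized model as deployed.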

Llama 4 Maverick, on the other hand, is a more advanced model with 17 billion active parameters and 128 experts, designed for high-level multimodal performance. Meta says it outperforms GPT-4o and Gemini 2.0 Flash in several benchmarks and achieves results comparable to the larger DeepSeek v3 in reasoning and coding tasks while using fewer than half as many active parameters. Despite its size, Maverick runs on a single NVIDIA H100 host.

Both models employ a “mixture of experts” (MoE) architecture, activating only a subset of the total parameters per token to improve efficiency. This architectural choice is a first for the Llama series, allowing for more efficient management of computational resources.
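The routing idea behind MoE can be sketched in a few lines: a small router scores every expert for each token, and only the top-scoring expert(s) actually run, so most parameters stay idle on any given token. The toy below illustrates that general mechanism, not Llama 4's actual router.

```python
# Toy mixture-of-experts routing: score experts per token, run only the
# top-k, combine their outputs. Expert functions here are stand-ins.

def route(token_scores, k=1):
    """Return indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(token_scores)),
                    key=lambda i: token_scores[i], reverse=True)
    return ranked[:k]

experts = [
    lambda h: h + 1.0,   # expert 0
    lambda h: h * 2.0,   # expert 1
    lambda h: h - 0.5,   # expert 2
    lambda h: h * 0.1,   # expert 3
]

def moe_layer(hidden, router_scores, k=1):
    """Run only the selected experts and average their outputs."""
    chosen = route(router_scores, k)
    outputs = [experts[i](hidden) for i in chosen]
    return sum(outputs) / len(outputs)

# A token whose router prefers expert 1: only that expert executes.
out = moe_layer(3.0, router_scores=[0.1, 0.9, 0.2, 0.05], k=1)
# out == 6.0
```

The efficiency win is that compute per token scales with the number of *active* experts, not the total parameter count, which is why Maverick's 17 billion active parameters sit inside a much larger total.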

To support these models, Meta is developing Llama 4 Behemoth, a teacher model with 288 billion active parameters and nearly two trillion total parameters. Although still in training, Behemoth already outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro in STEM-focused benchmarks such as MATH-500 and GPQA Diamond, according to Meta. Behemoth serves as a teacher for Scout and Maverick, distilling its knowledge into the smaller models, although it is not yet publicly available.
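A "teacher model" in this sense is typically used via knowledge distillation: the student is trained to match the teacher's full output distribution (soft targets) rather than hard labels alone. The sketch below shows the standard distillation loss in its simplest form; Meta has not published Behemoth's actual distillation recipe in this article.

```python
# Sketch of the generic teacher-student distillation loss: cross-entropy
# of the student's softmax against the teacher's softmax. Illustrative
# only -- not Behemoth's actual training objective.
import math

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits):
    """Cross-entropy of student predictions against teacher soft targets."""
    t = softmax(teacher_logits)
    s = softmax(student_logits)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

# The loss shrinks as the student's distribution approaches the teacher's.
far = distillation_loss([2.0, 0.5, -1.0], [0.0, 0.0, 0.0])
near = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
# near < far
```

This is how a model too large to serve can still improve smaller deployed models, which matches the role the article describes for Behemoth.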

Developers can download Llama 4 Scout and Maverick starting today at llama.com and Hugging Face. Meta is also rolling out partner access in the coming days. Users can experience Meta AI, powered by Llama 4, on platforms like WhatsApp, Messenger, Instagram Direct, and the Meta.AI website. Additional details, including technical insights and future plans for the Behemoth model, will be shared at LlamaCon on April 29.

These developments reflect Meta’s commitment to expanding the capabilities of AI and making these technologies accessible to a wide range of developers and users.

The adoption of advanced architectures like MoE and the integration of multimodal capabilities position Llama 4 Scout and Maverick as powerful tools for a variety of AI applications.