Meta Unleashes New Llama 4 AI Models
Meta has released the first two models from its Llama 4 suite: Llama 4 Maverick and Llama 4 Scout. The Maverick model is designed to be a “workhorse” for general assistant and chat use cases, while Scout is geared more toward “multi-document summarization, parsing extensive user activity for personalized tasks, and reasoning over vast codebases.”
Meta also previewed Llama 4 Behemoth, which it describes as among the world's smartest LLMs. A fourth model, Llama 4 Reasoning, is slated for release in the coming weeks.
Responding to the AI Landscape
Many have been expecting Meta to respond to the “threat” posed by the rise of China’s DeepSeek, which reportedly performs on par with some top AI models. Meta directly references this competition in its blog post introducing Llama 4.
Meta announced Llama 4 well ahead of its LlamaCon developer conference on April 29, an unusual move that gives developers ample time to experiment with the new models before the event. The Saturday announcement raised some eyebrows, but Meta's CEO Mark Zuckerberg simply stated, "That's when it was ready."
Model Specifications
With 17 billion active parameters and a total of 400 billion parameters distributed across 128 experts, the Llama 4 Maverick utilizes a Mixture of Experts (MoE) architecture. It is designed for efficiency, supports multimodal tasks, and can be deployed on a single NVIDIA H100 DGX host.

On the other hand, Llama 4 Scout offers 17 billion active parameters within a total of 109 billion parameters and 16 experts. Its unique feature is a 10 million token context window, enabling it to handle vast amounts of text effectively. Scout’s efficiency allows it to run on a single NVIDIA H100 GPU.
This is the first Llama generation to use a Mixture of Experts (MoE) architecture. Rather than running one monolithic network, an MoE model routes each piece of input to a small subset of specialized "expert" sub-networks, making both training and answering queries more efficient.
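The routing idea behind MoE can be sketched in a few lines of Python. This is a toy illustration only: the expert functions, the scoring rule, and the top-k choice below are invented for demonstration and do not reflect Llama 4's actual learned router or expert networks.

```python
# Toy Mixture-of-Experts (MoE) routing sketch. Illustrates the concept only;
# a real router is a learned layer, and real experts are feed-forward networks.

NUM_EXPERTS = 16   # Scout-style expert count
TOP_K = 1          # route each token to its single best-scoring expert

def make_expert(idx):
    """Stand-in 'expert': a tiny function that scales its input vector."""
    scale = 1.0 + idx * 0.1
    return lambda x: [scale * v for v in x]

experts = [make_expert(i) for i in range(NUM_EXPERTS)]

def router(token_vec):
    """Score every expert for this token and return the top-k indices.
    A real router uses a learned linear layer + softmax; this is a toy score."""
    scores = [(sum(token_vec) * (i + 1)) % 7 for i in range(NUM_EXPERTS)]
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)
    return ranked[:TOP_K]

def moe_layer(token_vec):
    """Only the selected experts actually run, so per-token compute stays
    small even though total parameters (all experts combined) are large."""
    chosen = router(token_vec)
    outputs = [experts[i](token_vec) for i in chosen]
    # combine the chosen experts' outputs (here: a simple average)
    return [sum(vals) / len(vals) for vals in zip(*outputs)]

out = moe_layer([0.5, 1.0, 1.5])
```

This sparsity is why Maverick can hold 400 billion total parameters while only 17 billion are "active" for any given token: most experts sit idle on each forward pass.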
Accessibility and Future Prospects
The Llama 4 models, Maverick and Scout, can be downloaded from the Llama website and Hugging Face. These models have been integrated into Meta AI, allowing accessibility through platforms like WhatsApp, Messenger, and Instagram DMs.
Meta believes that the capabilities of the Llama 4 collection will lead to better products and increased opportunities for developers to innovate on consumer and business use cases.
The Behemoth Model and Benchmarks
The upcoming Behemoth model is far larger, with 288 billion active parameters, 16 experts, and nearly 2 trillion total parameters. Meta's internal benchmarking shows Behemoth outperforming several top models on STEM skills evaluations.
Notably, Llama 4 models do not function as full-fledged reasoning models. Reasoning models are designed to fact-check responses and provide more reliable answers, albeit taking longer to generate results compared to traditional models.
Meta has fine-tuned the Llama 4 models to handle bias, particularly in chatbots discussing contentious topics. Despite efforts to address bias, the issue remains persistent in AI development.
Addressing User Concerns
Some users have reported mixed output quality from Maverick and Scout, which Meta attributes to the models' early release. The team says it is actively working on bug fixes and improving the user experience.

While Meta's performance claims for the Llama 4 series are based on a range of benchmarks, some in the AI community have raised concerns that the scores may be inflated by tuning the models specifically for those benchmark tests.