Four days after Meta unveiled the first Llama 4 models, Scout and Maverick, it is clear the company wasn't just flexing its muscles.
Scout and Maverick arrive at a moment when the world is drowning in chatbots that mostly sound the same. Meta's new duo is different, and the differences are technical, strategic, and very human.
Let’s start with the basics. Llama 4 Scout is a 17-billion active parameter Mixture-of-Experts (MoE) model. It’s built with 16 experts, and it stretches context memory to 10 million tokens. That’s not a typo.
While others are stuck in the 128K lane, Scout is processing entire libraries of context at once—ten million tokens is enough to summarise a hundred books without breaking a sweat. It has already outperformed rivals like Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across multiple benchmarks.
Maverick, on the other hand, is its flashier sibling—same 17B active parameters, but with 128 experts working behind the scenes. It’s a huge innovation in image-text grounding, outshining GPT-4o and Gemini 2.0 Flash, and holding its own against DeepSeek v3 on complex tasks like reasoning and coding—all while using fewer active parameters.
According to Meta, Maverick's chat version scored an Elo of 1417 on LMArena, a benchmark that pits models head-to-head in user-voted matchups.
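The "17 billion active parameters" figure is the key to both models: in a Mixture-of-Experts layer, a small gating network routes each token to only a few of the many experts, so most of the model's weights sit idle on any given token. The toy sketch below illustrates that routing mechanics; the gate and experts are random matrices invented for illustration, not Llama 4's actual learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(token, experts, gate_w, top_k=2):
    """Route one token vector through only the top-k scoring experts.

    Hypothetical toy: real MoE routing is learned end-to-end; here the
    gate and expert matrices are random, purely to show the mechanics.
    """
    scores = gate_w @ token                  # one routing score per expert
    top = np.argsort(scores)[-top_k:]        # pick the best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Only the selected experts run; the rest stay idle, which is why
    # "active" parameters are far fewer than total parameters.
    return sum(w * (experts[i] @ token) for i, w in zip(top, weights))

d, n_experts = 8, 16                         # Scout-style: 16 experts
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))
out = moe_layer(rng.standard_normal(d), experts, gate_w, top_k=2)
print(out.shape)  # (8,) — same shape as the input token vector
```

With 16 experts and 2 active per token, only about an eighth of the expert weights do work on each token; scale the same idea up and you get Maverick's profile of many total parameters but a modest active compute cost.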
What Makes These Models So Good?

The real trick lies in what’s behind Scout and Maverick: a still-in-training model called Llama 4 Behemoth. It has 288 billion active parameters and 16 experts. Meta hasn’t released it yet, but it’s already beating GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro in STEM-focused benchmarks.
That’s what’s powering the distilled intelligence in the smaller models—and that distillation process is what makes them unusually sharp for their size.
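Meta hasn't published Behemoth's distillation recipe, but the general idea of knowledge distillation is well established: the smaller student model is trained to match the larger teacher's softened output distribution rather than just hard labels. A minimal sketch of that standard loss, under the assumption of a Hinton-style temperature-scaled KL objective:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T gives softer distributions."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL divergence between softened teacher and student distributions.

    Illustrative only: Meta has not disclosed Behemoth's exact
    distillation objective, so this is the textbook formulation.
    """
    p = softmax(teacher_logits, T)   # teacher's "soft labels"
    q = softmax(student_logits, T)
    return float(np.sum(p * np.log(p / q)))

teacher = [2.0, 0.5, -1.0]
print(distill_loss(teacher, teacher))              # 0.0 — perfect match
print(distill_loss(teacher, [0.0, 0.0, 0.0]) > 0)  # True — mismatch is penalised
```

The soft labels carry the teacher's relative confidence across all answers, which is what lets a 17B-active student absorb far more than a hard-label training signal would convey.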
Scout and Maverick don’t just spit out answers. They understand multimodal inputs. They interpret long text chains, images, and even videos with surprising fluency.
This was made possible by a redesigned architecture that fuses text and visual tokens early in the process, letting the model “think” about them together rather than switching back and forth. The result is a far more fluid, natural performance in tasks that involve both reading and seeing.
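Early fusion, at its simplest, means projecting both modalities into one shared embedding space and concatenating them into a single sequence before any transformer layers run, so attention operates over text and image jointly from the start. A toy sketch of that idea; the random projection matrices and dimensions here are made up, and Llama 4's real vision encoder is far more involved:

```python
import numpy as np

rng = np.random.default_rng(1)

def early_fusion(text_tokens, image_patches, w_text, w_img):
    """Project text tokens and image patches into one shared space and
    concatenate them into a single sequence, so every subsequent layer
    attends over both modalities at once (early fusion).

    Toy illustration: projections are random; real encoders are learned.
    """
    text_emb = text_tokens @ w_text      # (n_text, d_model)
    img_emb = image_patches @ w_img      # (n_patches, d_model)
    return np.concatenate([text_emb, img_emb], axis=0)

d_model = 16
text = rng.standard_normal((5, 32))      # 5 text tokens, feature dim 32
patches = rng.standard_normal((9, 64))   # 9 image patches, feature dim 64
seq = early_fusion(text, patches,
                   rng.standard_normal((32, d_model)),
                   rng.standard_normal((64, d_model)))
print(seq.shape)  # (14, 16): one fused sequence of 5 + 9 tokens
```

The contrast is with late fusion, where separate text and vision models run independently and their outputs are merged at the end; fusing early is what lets the model "think" about reading and seeing together.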
Meta’s Strategy Is Bold and Global
These models aren’t locked behind a paywall or hidden in a lab. They’re already available in more than 40 countries, including Nigeria, Ghana, South Africa, and Zimbabwe, through WhatsApp, Instagram, Messenger, and the Meta.AI web app.
Multimodal features are currently limited to English and to the US, but Meta says it is working on expanding access.
As for performance versus cost, Maverick brings what Meta calls a “best-in-class performance-to-cost ratio.” Translation? It’s really good, and it doesn’t take a data centre to run. That matters in a world where developers want high-performing models that won’t bankrupt them.
It’s Beyond Technical—It’s Personal
Meta is also tweaking the way these models respond to people. They’re more “steerable”—meaning you can tell them exactly how to behave and they’ll follow instructions without inserting moral judgments or personal bias. They’re also better at formatting responses, structuring replies clearly, and offering actionable suggestions.
According to Meta:
“Thanks to model improvements, Meta AI with Llama 4 is the assistant you can count on to provide helpful, factual responses without judgment. It responds conversationally and shares informative answers to more requests on a range of topics like personal advice, opinions and recommendations, and more.”
That’s a subtle but important shift. Rather than trying to be all-knowing or opinionated, Llama 4 models aim to be useful without being preachy.
What’s Coming Next?
Meta's vision with Llama 4 isn't just about releasing models; the company is setting up an ecosystem. At the heart of this is the belief that openness fuels innovation.
Scout and Maverick are openly available: anyone can download and experiment with them via llama.com or Hugging Face. That opens the door to new applications, personalised AI agents, and enterprise tools, all built on the same tech powering Meta's consumer apps.
And then there’s Llama 4 Behemoth, still in training, still growing. When it drops, it could very well reset expectations again.
If this is the beginning, it’s already a big one. Scout and Maverick are Meta saying the future of AI is fast, efficient, multimodal, and more open than ever before.