
The Llama 4 Herd: The Beginning of a New Era of Natively Multimodal AI

by Meta AI · open-source · Last verified 2026-03-17

Llama 4 introduces a family of natively multimodal mixture-of-experts models—Scout (17B active parameters, 16 experts), Maverick (17B active parameters, 128 experts), and Behemoth (288B active parameters, 16 experts)—pretrained jointly on text, image, and video data. Maverick achieves top scores on vision-language benchmarks, while Scout offers a 10M-token context window at a fraction of the compute of comparable models.
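The mixture-of-experts design is why a model with far more total parameters activates only 17B per token: a learned router scores every expert for each token and forwards the token to the top-scoring expert(s), whose outputs are blended by the normalized gate weights. Below is a minimal, illustrative sketch of top-k routing; the function name and scores are hypothetical and do not reflect Meta's actual implementation details.

```python
import math

def top_k_route(logits, k=1):
    """Select the top-k experts for one token and softmax-normalize
    their gate weights so the selected weights sum to 1."""
    idx = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in idx)                      # stabilize the softmax
    exps = [math.exp(logits[i] - m) for i in idx]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(idx, exps)]

# Hypothetical router scores for one token over 16 experts (Scout-sized pool).
scores = [0.1, 2.3, -0.5, 1.7] + [0.0] * 12
print(top_k_route(scores, k=1))  # -> [(1, 1.0)]: one expert handles the token
```

With k=1 a single expert processes the token with weight 1.0; larger k trades compute for a weighted blend of several experts' outputs.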

https://arxiv.org/abs/2504.07557
Overall grade: B (Above Average)
Adoption: A · Quality: A+ · Freshness: A+ · Citations: B · Engagement: F

Specifications

License
Llama 4 Community License
Pricing
open-source
Capabilities
text-generation, image-understanding, video-understanding, long-context, mixture-of-experts
Integrations
Hugging Face, Ollama, Together AI, Groq, AWS Bedrock
Use Cases
multimodal-reasoning, document-understanding, video-analysis, code-generation, long-document-qa
API Available
Yes
Tags
llm, multimodal, mixture-of-experts, meta, open-source, 2025
Added
2026-03-17
Completeness
100%

Index Score

67.5
Adoption
82
Quality
92
Freshness
98
Citations
65
Engagement
0
