The Llama 4 Herd: The Beginning of a New Era of Natively Multimodal AI
by Meta AI · open-source · Last verified 2026-03-17
Llama 4 introduces a family of natively multimodal mixture-of-experts models—Scout (17B/16 experts), Maverick (17B/128 experts), and Behemoth (288B/16 experts)—pretrained jointly on text, image, and video data. Maverick achieves top scores on vision-language benchmarks while Scout offers 10M-token context at a fraction of the compute of comparable models.
https://arxiv.org/abs/2504.07557
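The expert counts above (17B active parameters out of a much larger total) come from top-k expert routing: each token's router picks a few experts, so only a fraction of the weights run per token. Below is a minimal NumPy sketch of that idea; all names and shapes are illustrative, not Meta's implementation.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy mixture-of-experts layer: route each token to its top-k experts.

    x       : (tokens, d) input activations
    gate_w  : (d, n_experts) router weights (hypothetical)
    experts : list of (d, d) expert weight matrices
    k       : experts activated per token; a small k is why only
              ~17B of the total parameters are active per token
    """
    logits = x @ gate_w                         # (tokens, n_experts) router scores
    topk = np.argsort(logits, axis=-1)[:, -k:]  # indices of the top-k experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = topk[t]
        # softmax over the selected experts only
        w = np.exp(logits[t, sel] - logits[t, sel].max())
        w /= w.sum()
        for weight, e in zip(w, sel):
            out[t] += weight * (x[t] @ experts[e])
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 3
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # (3, 8)
```

Only the k selected expert matrices are multiplied for each token, which is how compute scales with active parameters rather than total parameters.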
Specifications
- License: Llama 4 Community License
- Pricing: open-source
- Capabilities: text-generation, image-understanding, video-understanding, long-context, mixture-of-experts
- Integrations: Hugging Face, Ollama, Together AI, Groq, AWS Bedrock
- Use Cases: multimodal-reasoning, document-understanding, video-analysis, code-generation, long-document-qa
- API Available: Yes
- Tags: llm, multimodal, mixture-of-experts, meta, open-source, 2025
- Added: 2026-03-17
- Completeness: 100%
Index Score: 67.5
- Adoption: 82
- Quality: 92
- Freshness: 98
- Citations: 65
- Engagement: 0