Llama 3.2 11B Vision
by Meta · open-source · Last verified 2026-03-17
Meta's first multimodal Llama model, offering native image understanding at a compact 11B parameters. It bridges text and vision tasks in a single open-source model suitable for diverse deployments.
https://llama.meta.com
Overall Grade: B (Above Average)
- Adoption: B+
- Quality: B+
- Freshness: B+
- Citations: B
- Engagement: F
Specifications
- License: Llama 3.2 Community License
- Pricing: open-source
- Capabilities: text-generation, image-understanding, visual-qa, instruction-following, multimodal-reasoning
- Integrations: huggingface, ollama, vllm, together-ai
- Use Cases: visual-qa, image-captioning, document-understanding, multimodal-chatbots, accessibility
- API Available: No
- Parameters: 11B
- Context Window: 128K tokens
- Modalities: text, image
- Training Cutoff: December 2023
- Tags: llm, open-source, multimodal, vision, compact, meta
- Added: 2026-03-17
- Completeness: 100%
Index Score: 60.8
- Adoption: 72
- Quality: 75
- Freshness: 70
- Citations: 68
- Engagement: 0