ModelLLMsvPixtral-12B-2409

Pixtral 12B

by Mistral AI · open-source · Last verified 2026-03-17

Mistral AI's natively multimodal model with a dedicated 400M parameter vision encoder alongside a 12B language backbone. Processes images at their native resolution without fixed-size tokenization.

https://mistral.ai/news/pixtral-12b/ ↗

C—Below Average

Adoption: BQuality: B+Freshness: B+Citations: FEngagement: F

Specifications

License: Apache 2.0
Pricing: open-source
Capabilities: text-generation, image-understanding, visual-qa, chart-analysis, document-understanding
Integrations: huggingface, vllm, ollama
Use Cases: visual-qa, document-analysis, chart-interpretation, image-captioning, multimodal-chatbots
API Available: Yes
Parameters: 12B
Context Window: 128K tokens
Modalities: text, image
Training Cutoff: Mid 2024
Tags: llm, open-source, multimodal, vision, natively-multimodal, mistral
Added: 2026-03-17
Completeness: 80%

Index Score

Adoption

Quality

Freshness

Citations

Engagement

Need help choosing the right model?

Get Expert Guidance

Explore the full AI ecosystem on Agents as a Service