Skip to main content
Modelmultimodalv2.5

Qwen2.5-VL-72B

by Alibaba Cloud (Qwen Team) · free · Last verified 2026-03-17

Qwen2.5-VL-72B is Alibaba's flagship open vision-language model at 72 billion parameters, achieving top-tier performance on visual understanding benchmarks including chart analysis, document parsing, and fine-grained image understanding. It supports dynamic resolution image inputs and video understanding with native high-resolution processing.

https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct
B
BAbove Average
Adoption: B+Quality: A+Freshness: A+Citations: BEngagement: F

Specifications

License
Qwen License
Pricing
free
Capabilities
vision, visual-question-answering, document-understanding, video-understanding, ocr, agentic-vision
Integrations
Hugging Face, Qwen API, vLLM, Ollama
Use Cases
document-analysis, visual-qa, video-understanding, chart-interpretation, agentic-visual-tasks
API Available
Yes
Parameters
72B
Context Window
128K
Modalities
text, image, video
Training Cutoff
2024
Tags
alibaba, qwen, vision-language, open-source, frontier, large
Added
2026-03-17
Completeness
100%

Index Score

64
Adoption
72
Quality
91
Freshness
94
Citations
68
Engagement
0

Explore the full AI ecosystem on Agents as a Service