Qwen2.5-VL-72B
by Alibaba Cloud (Qwen Team) · free · Last verified 2026-03-17
Qwen2.5-VL-72B is Alibaba's flagship open vision-language model at 72 billion parameters, achieving top-tier performance on visual understanding benchmarks including chart analysis, document parsing, and fine-grained image understanding. It supports dynamic resolution image inputs and video understanding with native high-resolution processing.
https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct ↗B
B—Above Average
Adoption: B+Quality: A+Freshness: A+Citations: BEngagement: F
Specifications
- License
- Qwen License
- Pricing
- free
- Capabilities
- vision, visual-question-answering, document-understanding, video-understanding, ocr, agentic-vision
- Integrations
- Hugging Face, Qwen API, vLLM, Ollama
- Use Cases
- document-analysis, visual-qa, video-understanding, chart-interpretation, agentic-visual-tasks
- API Available
- Yes
- Parameters
- 72B
- Context Window
- 128K
- Modalities
- text, image, video
- Training Cutoff
- 2024
- Tags
- alibaba, qwen, vision-language, open-source, frontier, large
- Added
- 2026-03-17
- Completeness
- 100%
Index Score
64Adoption
72
Quality
91
Freshness
94
Citations
68
Engagement
0