Benchmark · Computer Vision · v1.0

MMMU

by CUHK / Waterloo · free · Last verified 2026-03-01

MMMU is a challenging multimodal benchmark designed to evaluate large models on expert-level tasks. It contains over 11,500 college-level problems spanning six core disciplines, requiring models to integrate deep subject knowledge with visual perception to answer multiple-choice and open-ended questions that demand detailed reasoning.

https://mmmu-benchmark.github.io
Overall grade: B (Above Average)
Adoption: B+ · Quality: A+ · Freshness: A · Citations: B+ · Engagement: F

Specifications

License
Apache-2.0
Pricing
free
Capabilities
evaluating expert-level multimodal reasoning, assessing visual question answering in specialized domains, benchmarking large multimodal models (LMMs), testing knowledge across humanities, sciences, and engineering, measuring few-shot learning on complex problems, analyzing model performance on problems requiring chain-of-thought reasoning, providing a standardized test for college-level AI capabilities
Integrations
Use Cases
API Available
No
Evaluated Models
claude-4, gpt-5, gemini-2.5-pro
Metrics
accuracy, per-discipline-accuracy
Methodology
College-level multiple-choice and open-ended questions with image inputs across 30 subjects, testing both visual understanding and domain knowledge. A minimal scoring sketch follows the specifications below.
Last Run
2026-03-01
Tags
benchmark, evaluation, multimodal, reasoning, expert-level, lmm-evaluation, visual-question-answering, vqa, college-level, science-reasoning, chain-of-thought
Added
2026-03-17
Completeness
90%
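
The Metrics and Methodology fields above reduce to a simple scoring rule: compare each predicted answer against the gold answer, then aggregate overall and per discipline. The sketch below illustrates that rule; the prediction-record format (discipline / prediction / answer keys) is a hypothetical layout for illustration, not the official MMMU evaluation harness.

from collections import defaultdict

def score(records):
    """Compute overall accuracy and per-discipline accuracy.

    records: iterable of dicts with keys 'discipline' (e.g. 'Science'),
    'prediction', and 'answer' (assumed format, not the official schema).
    """
    total, correct = 0, 0
    per_disc = defaultdict(lambda: [0, 0])  # discipline -> [correct, total]

    for r in records:
        # Letter-option comparison; open-ended items would need a stricter matcher.
        hit = r["prediction"].strip().upper() == r["answer"].strip().upper()
        total += 1
        correct += int(hit)
        per_disc[r["discipline"]][0] += int(hit)
        per_disc[r["discipline"]][1] += 1

    return {
        "accuracy": correct / total if total else 0.0,
        "per_discipline_accuracy": {d: c / n for d, (c, n) in per_disc.items()},
    }

if __name__ == "__main__":
    demo = [
        {"discipline": "Science", "prediction": "B", "answer": "B"},
        {"discipline": "Art & Design", "prediction": "C", "answer": "A"},
    ]
    print(score(demo))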

Index Score: 66.9
Adoption: 76 · Quality: 90 · Freshness: 88 · Citations: 74 · Engagement: 0
