Massive Multitask Language Understanding (MMLU)
A comprehensive benchmark that measures a language model's knowledge across 57 subjects, ranging from the humanities and social sciences to STEM. Every question is posed in a multiple-choice format, so strong performance requires broad factual and world knowledge.
https://huggingface.co/datasets/cais/mmlu
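The multiple-choice setup described above can be illustrated with a short sketch. The item fields below mirror the Hugging Face dataset schema (`question`, `subject`, `choices`, and an integer `answer` index), but the `format_mmlu_prompt` and `score` helpers and the exact prompt wording are illustrative assumptions, not part of any official evaluation harness.

```python
# Hypothetical sketch: turning an MMLU-style item into a multiple-choice
# prompt and scoring predicted answer letters. The field names follow the
# cais/mmlu schema on Hugging Face; the prompt template is an assumption.

LETTERS = ["A", "B", "C", "D"]

def format_mmlu_prompt(item: dict) -> str:
    """Render one item as a zero-shot multiple-choice question."""
    lines = [item["question"]]
    for letter, choice in zip(LETTERS, item["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)

def score(predictions: list[str], items: list[dict]) -> float:
    """Accuracy: fraction of items where the predicted letter is correct."""
    correct = sum(
        pred == LETTERS[item["answer"]]
        for pred, item in zip(predictions, items)
    )
    return correct / len(items)

# Example item in the dataset's schema (content is illustrative only).
item = {
    "question": "What is the capital of France?",
    "subject": "geography",
    "choices": ["Berlin", "Madrid", "Paris", "Rome"],
    "answer": 2,  # integer index into choices, i.e. option "C"
}
print(format_mmlu_prompt(item))
```

Reported MMLU results are simply the mean accuracy of such letter predictions, usually broken down per subject and then averaged across the 57 subjects.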
Overall grade: F (Critical) · Adoption: F · Quality: F · Freshness: A+ · Citations: F · Engagement: F
Specifications
- API Available: No
- Tags: evaluation-benchmark, general-knowledge, multi-task, reasoning, academic, LLM-evaluation
- Added: 2026-03-26
Index Score
- Adoption: 0
- Quality: 0
- Freshness: 100
- Citations: 0
- Engagement: 0