Massive Multitask Language Understanding (MMLU)
A comprehensive benchmark that measures an AI model's knowledge across 57 subjects, from the humanities to STEM. It assesses a model's understanding and reasoning in zero-shot and few-shot settings, making it a widely used test of general language-model capability.
https://huggingface.co/datasets/cais/mmlu
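As a minimal sketch of how few-shot MMLU prompts are typically assembled: each question has four answer choices labeled A–D, and development-set examples (with answers) are prepended before the unanswered test question. The example rows below are illustrative, not drawn from the real dataset; the actual rows can be loaded with the `datasets` library from the URL above.

```python
# Few-shot prompt construction for MMLU-style multiple-choice questions.
# Assumes each row is a dict with "question" (str), "choices" (list of 4
# strings), and "answer" (int index 0-3), matching the cais/mmlu schema.

CHOICE_LABELS = ["A", "B", "C", "D"]

def format_question(row: dict, include_answer: bool) -> str:
    """Render one question with labeled choices; append the answer for shots."""
    lines = [row["question"]]
    for label, choice in zip(CHOICE_LABELS, row["choices"]):
        lines.append(f"{label}. {choice}")
    suffix = f" {CHOICE_LABELS[row['answer']]}" if include_answer else ""
    lines.append("Answer:" + suffix)
    return "\n".join(lines)

def build_prompt(dev_rows: list, test_row: dict, subject: str) -> str:
    """Prepend answered dev-set shots, then the unanswered test question."""
    header = (f"The following are multiple choice questions "
              f"(with answers) about {subject}.\n\n")
    shots = "\n\n".join(format_question(r, True) for r in dev_rows)
    return header + shots + "\n\n" + format_question(test_row, False)
```

The model's next-token prediction after the trailing "Answer:" is then compared against the gold label; zero-shot evaluation is the same prompt with an empty list of dev rows.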
Specifications
- API Available: No
- Tags: evaluation-benchmark, multitask, knowledge, reasoning, llm-evaluation, zero-shot, few-shot
- Added: 2026-03-25