Dataset · benchmarks · v1.0

GSM8K Dataset

by OpenAI · open-source · Last verified 2026-03-17

Grade School Math 8K (GSM8K) is a dataset of about 8,500 high-quality, linguistically diverse grade school math word problems, each requiring 2 to 8 reasoning steps to solve. Created by OpenAI, it is widely used to evaluate multi-step arithmetic reasoning and the effectiveness of chain-of-thought prompting.
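Evaluation on this kind of benchmark usually comes down to comparing a model's final number with the reference's final number. The sketch below assumes reference solutions end with a line of the form `#### <number>`, and that the model has been prompted to finish its output the same way. The function names and the sample strings are hypothetical, written for illustration only; they are not taken from any official evaluation script.

```python
import re
from typing import Optional, Sequence

# Assumption: the final answer follows a "####" marker, e.g. "#### 72".
FINAL_ANSWER = re.compile(r"####\s*(-?[\d,]*\.?\d+)")


def extract_final_answer(solution: str) -> Optional[str]:
    """Return the number after the last '####' marker, with thousands separators removed."""
    matches = FINAL_ANSWER.findall(solution)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions whose extracted answer equals the reference's extracted answer.

    Answers are compared as strings, so "72" and "72.0" count as different.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        pred_answer = extract_final_answer(pred)
        if pred_answer is not None and pred_answer == extract_final_answer(ref):
            correct += 1
    return correct / len(references)


if __name__ == "__main__":
    # Hypothetical solution text in the assumed format (not a record from the dataset).
    reference = "Half of 48 is 48/2 = 24.\nThe total is 48 + 24 = 72.\n#### 72"
    prediction = "48 / 2 = 24, and 48 + 24 = 72.\n#### 72"
    print(extract_final_answer(reference))
    print(exact_match_accuracy([prediction], [reference]))
```

String comparison keeps the sketch short; a stricter scorer might normalize numbers (for example with `decimal.Decimal`) before comparing.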

https://huggingface.co/datasets/openai/gsm8k
B+ (Good)
Adoption: A+ · Quality: A+ · Freshness: B+ · Citations: A+ · Engagement: F

Specifications

License: MIT
Pricing: open-source
Capabilities: math-evaluation, reasoning-benchmark, chain-of-thought
Integrations: huggingface-datasets, lm-eval-harness
Use Cases: model-evaluation, math-reasoning, chain-of-thought-research
API Available: No
Tags: benchmark, math, grade-school, word-problems, chain-of-thought
Added: 2026-03-17
Completeness: 100%
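Since the listing names huggingface-datasets as an integration and chain-of-thought research as a use case, a minimal sketch of both may help. It assumes the third-party `datasets` package is installed, that network access is available, and that the dataset is loaded from the `openai/gsm8k` repository with the `main` configuration and `question` / `answer` fields. The helper names and the prompt template are hypothetical; this is not the format used by lm-eval-harness.

```python
from typing import Dict, Iterable, List


def load_gsm8k(split: str = "test"):
    """Load one split from the Hugging Face Hub (requires `pip install datasets` and network access)."""
    # Imported lazily so the prompt helper below works without the package installed.
    from datasets import load_dataset

    return load_dataset("openai/gsm8k", "main", split=split)


def build_cot_prompt(exemplars: Iterable[Dict[str, str]], question: str) -> str:
    """Format worked exemplars, then the target question with an open answer slot.

    The "Question:/Answer:" template is an illustrative assumption.
    """
    parts: List[str] = []
    for exemplar in exemplars:
        parts.append(f"Question: {exemplar['question']}\nAnswer: {exemplar['answer']}")
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    train = load_gsm8k("train")
    test = load_gsm8k("test")
    # Use the first few training records as worked examples for one test question.
    few_shot = [train[i] for i in range(3)]
    print(build_cot_prompt(few_shot, test[0]["question"]))
```

Because the exemplars carry full worked solutions, the model is nudged to write out intermediate steps before its final answer, which is the core idea of chain-of-thought prompting.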

Index Score

79.8
Adoption: 94
Quality: 91
Freshness: 74
Citations: 96
Engagement: 0