AIME 2024
by MAA · free · Last verified 2026-03-01
A highly challenging benchmark for evaluating the mathematical reasoning of frontier AI models. It uses 30 problems from the 2024 American Invitational Mathematics Examination (AIME), which are designed to test creative problem-solving, multi-step deduction, and knowledge across number theory, geometry, algebra, and combinatorics.
https://artofproblemsolving.com/wiki/index.php/2024_AIME
Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: A · Citations: B+ · Engagement: F
Specifications
- License
- Public Domain
- Pricing
- free
- Capabilities
- evaluating advanced mathematical problem-solving, benchmarking multi-step logical reasoning chains, assessing creative and non-standard solution strategies, testing proficiency in number theory, geometry, and combinatorics, measuring performance on pre-olympiad level mathematics, gauging model ability for abstract thinking and symbolic manipulation
- Integrations
- Use Cases
- API Available
- No
- Evaluated Models
- claude-4, gpt-5, gemini-2.5-pro, deepseek-v3
- Metrics
- solve-rate, average-score
- Methodology
- 30 AIME problems with integer answers 0-999. Each model solves every problem and reports a final integer answer, which is scored by exact match against the official answer (see the scoring sketch after this list).
- Last Run
- 2026-02-15
- Tags
- benchmark, model-evaluation, mathematics, reasoning, llm-benchmark, competition-math, problem-solving, number-theory, geometry, combinatorics
- Added
- 2026-03-17
- Completeness
- 90%
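The methodology above reduces to an exact-match check on integer answers in the range 0-999, aggregated into the listed metrics. The sketch below illustrates that scoring loop under stated assumptions: the dictionary schema, function names, and the reading of average-score as the solve rate on a 0-100 scale are illustrative, not the benchmark's actual harness.

```python
# Minimal sketch of exact-match scoring for AIME-style integer answers.
# Schema (problem_id -> integer answer) and metric names are assumptions.

def is_valid_aime_answer(value: int) -> bool:
    """AIME answers are integers in the range 0-999."""
    return 0 <= value <= 999

def score_run(predictions: dict[int, int], answers: dict[int, int]) -> dict[str, float]:
    """Score one model run: exact match per problem, then aggregate."""
    solved = 0
    for problem_id, reference in answers.items():
        predicted = predictions.get(problem_id)
        if predicted is not None and is_valid_aime_answer(predicted) and predicted == reference:
            solved += 1
    total = len(answers)
    return {
        "solved": solved,
        "total": total,
        "solve_rate": solved / total,            # fraction of problems answered exactly
        "average_score": 100 * solved / total,   # same quantity on a 0-100 scale (assumed convention)
    }

if __name__ == "__main__":
    # Toy example with 3 of the 30 problems; answers are made up for illustration.
    gold = {1: 204, 2: 25, 3: 809}
    preds = {1: 204, 2: 23, 3: 809}
    print(score_run(preds, gold))  # solved=2, total=3, solve_rate≈0.667, average_score≈66.7
```

Because every answer is a single integer, exact match needs no partial credit or rubric; a run's solve-rate over the 30 problems is the entire result.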
Index Score: 64.9
Adoption: 72
Quality: 88
Freshness: 82
Citations: 74
Engagement: 0