
AIME 2024

by MAA · open-source · Last verified 2026-03-01

Problems from the 2024 American Invitational Mathematics Examination (AIME), used to evaluate the mathematical reasoning of frontier models. Comprises 30 challenging problems (15 each from AIME I and AIME II), all requiring creative problem-solving and multi-step mathematical reasoning, with each answer a single integer.

https://artofproblemsolving.com/wiki/index.php/2024_AIME
Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: A · Citations: B+ · Engagement: F

Specifications

License
Public Domain
Pricing
open-source
Capabilities
model-evaluation, competition-math-testing, creative-reasoning-assessment
Integrations
lm-eval-harness (see the invocation sketch after this list)
Use Cases
mathematical-reasoning-evaluation, frontier-model-testing, competition-math-assessment
API Available
No
Evaluated Models
claude-4, gpt-5, gemini-2.5-pro, deepseek-v3
Metrics
solve-rate, average-score
Methodology
30 AIME problems with integer answers in the range 0-999. Models solve each problem and report a final integer answer, which is scored by exact match against the official answer (a grading sketch follows this list).
Last Run
2026-02-15
Tags
benchmark, evaluation, mathematics, competition, advanced
Added
2026-03-17
Completeness
100%
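
A sketch of driving this benchmark through the listed lm-eval-harness integration via its Python entry point, lm_eval.simple_evaluate. The task name "aime24" and the model checkpoint are assumptions, since registered task identifiers vary across harness versions; confirm with lm_eval --tasks list first.

import lm_eval

# Run the AIME task through the harness's HuggingFace backend.
# The task name "aime24" is an assumption -- substitute whatever
# identifier your harness version registers for this benchmark.
# YOUR_MODEL_CHECKPOINT is a hypothetical placeholder.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=YOUR_MODEL_CHECKPOINT",
    tasks=["aime24"],
)
print(results["results"])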
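
A minimal grading sketch matching the methodology above, assuming completions arrive as free-form text from which a final integer must be pulled; the extraction regex and function names are illustrative, not the official evaluator.

import re

def extract_answer(completion: str) -> int | None:
    # AIME answers are integers in [0, 999]; take the last such
    # integer in the completion as the model's final answer.
    for token in reversed(re.findall(r"\d+", completion)):
        value = int(token)
        if 0 <= value <= 999:
            return value
    return None

def solve_rate(completions: list[str], answers: list[int]) -> float:
    # Exact match: a problem counts as solved only if the extracted
    # integer equals the official answer.
    correct = sum(extract_answer(c) == a for c, a in zip(completions, answers))
    return correct / len(answers)

Over the 30-problem set, solve_rate gives the solve-rate metric directly; the listed average-score metric is presumably the corresponding raw count of correct answers averaged over runs.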

Index Score

Overall
64.9
Adoption
72
Quality
88
Freshness
82
Citations
74
Engagement
0
