
AIME 2024

by MAA · free · Last verified 2026-03-01

A highly challenging benchmark for evaluating the mathematical reasoning of frontier AI models. It uses 30 problems from the 2024 American Invitational Mathematics Examination (AIME), which are designed to test creative problem-solving, multi-step deduction, and knowledge across number theory, geometry, algebra, and combinatorics.

https://artofproblemsolving.com/wiki/index.php/2024_AIME
Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: A · Citations: B+ · Engagement: F

Specifications

License
Public Domain
Pricing
free
Capabilities
evaluating advanced mathematical problem-solving, benchmarking multi-step logical reasoning chains, assessing creative and non-standard solution strategies, testing proficiency in number theory, geometry, algebra, and combinatorics, measuring performance on pre-olympiad-level mathematics, gauging abstract thinking and symbolic manipulation, verifying exact-match final-answer correctness
Integrations
Use Cases
API Available
No
Evaluated Models
claude-4, gpt-5, gemini-2.5-pro, deepseek-v3
Metrics
solve-rate, average-score
Methodology
30 AIME problems, each with an integer answer from 0 to 999. Models solve each problem and submit a final integer answer, which is scored by exact match (a grading sketch follows these specifications).
Last Run
2026-02-15
Tags
benchmark, model-evaluation, mathematics, reasoning, llm-benchmark, competition-math, problem-solving, number-theory, geometry, combinatorics
Added
2026-03-17
Completeness
0.9%
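
The evaluation flow implied by the Methodology and Metrics fields is straightforward to sketch. The Python below is a minimal, hypothetical grader: `extract_final_integer` and the sample data are illustrative assumptions, not the catalog's actual harness. It parses each response's final integer, checks it against the reference by exact match, and reports the count of solved problems and the solve rate.

```python
import re

def extract_final_integer(response: str) -> int | None:
    """Pull the last integer from a model response (illustrative heuristic).

    AIME answers are integers in [0, 999]; anything else is treated as wrong.
    """
    matches = re.findall(r"-?\d+", response)
    if not matches:
        return None
    value = int(matches[-1])
    return value if 0 <= value <= 999 else None

def grade(responses: list[str], references: list[int]) -> dict:
    """Exact-match grading over one run of the 30 problems."""
    assert len(responses) == len(references)
    correct = sum(
        extract_final_integer(resp) == ref
        for resp, ref in zip(responses, references)
    )
    return {
        "score": correct,                         # problems solved, 0-30
        "solve_rate": correct / len(references),  # fraction solved
    }

# Made-up example data: two toy responses, one correct.
print(grade(["The answer is 204.", "I get 73."], [204, 113]))
# -> {'score': 1, 'solve_rate': 0.5}
```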

Index Score: 64.9
Adoption: 72 · Quality: 88 · Freshness: 82 · Citations: 74 · Engagement: 0
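
How the five sub-scores combine into the 64.9 index is not stated on the page; a plain average of 72, 88, 82, 74, and 0 is 63.2, so the weighting is evidently non-uniform. The sketch below shows the general shape of such a roll-up with placeholder weights; the actual weights are an unpublished assumption.

```python
SUB_SCORES = {"adoption": 72, "quality": 88, "freshness": 82,
              "citations": 74, "engagement": 0}

def index_score(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted mean of sub-scores; weights must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(scores[name] * weights[name] for name in scores)

# Placeholder weights only: they yield 75.6, not the listed 64.9,
# because the catalog's real weighting is unpublished.
weights = {"adoption": 0.25, "quality": 0.30, "freshness": 0.20,
           "citations": 0.20, "engagement": 0.05}
print(round(index_score(SUB_SCORES, weights), 1))  # 75.6
```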
