Benchmark · LLMs · v2024

ARC-AGI

by François Chollet / ARC Prize Foundation · open-source · Last verified 2026-03-17

ARC-AGI (Abstraction and Reasoning Corpus for Artificial General Intelligence) measures fluid intelligence through visual grid transformation puzzles. Models must infer a transformation rule from a few demonstration pairs (typically three) and apply it to a test grid, a task most humans solve easily but one that has historically been extremely difficult for AI systems.
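To make the puzzle format concrete, here is a minimal sketch of one task and a candidate rule. The task is a made-up toy, not a real ARC-AGI task; the "train"/"test" layout and integer colour codes follow the public ARC task files, and the helper names `mirror_left_right` and `rule_fits` are hypothetical.

```python
# Hypothetical toy task, laid out like the public ARC JSON task files:
# "train" holds demonstration pairs, "test" holds the input to solve.
# Grids are lists of rows; each cell is an integer colour code (0-9).
task = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]},
        {"input": [[0, 0], [2, 0]], "output": [[0, 0], [0, 2]]},
    ],
    "test": [{"input": [[3, 0], [0, 4]]}],
}


def mirror_left_right(grid):
    """Candidate rule: flip the grid horizontally by reversing every row."""
    return [list(reversed(row)) for row in grid]


def rule_fits(rule, examples):
    """True if the rule reproduces every demonstration output exactly."""
    return all(rule(ex["input"]) == ex["output"] for ex in examples)


# Only apply the rule to the test grid if it explains all demonstrations.
if rule_fits(mirror_left_right, task["train"]):
    prediction = mirror_left_right(task["test"][0]["input"])
    print(prediction)  # [[0, 3], [4, 0]]
```

Real tasks use larger grids and far less obvious rules; the difficulty lies in finding the rule, not in applying it.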

https://arcprize.org

C+ (Average)

Adoption: A · Quality: A+ · Freshness: A · Citations: F · Engagement: F

Specifications

License: Apache-2.0
Pricing: open-source
Capabilities: evaluation, abstract-reasoning, few-shot-generalization
Integrations
Use Cases: model-evaluation, agi-research, reasoning-benchmarks
API Available: No
Evaluated Models: gpt-4o, claude-opus-4, gemini-2-5-pro, o3
Metrics: accuracy
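Methodology: 400 public + 100 semi-private + 500 private tasks. Each task provides a few input-output grid examples (typically 3); the model must predict the output grid for a test input. Up to 2 submission attempts are allowed, and a task counts as solved only if an attempt matches the expected grid exactly. Accuracy is the percentage of tasks solved. Human baseline ≈ 85%.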
Last Run: 2026-03-01
Tags: agi, abstract-reasoning, visual-patterns, few-shot, core-knowledge
Added: 2026-03-17
Completeness: 80%
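The scoring rule described under Methodology (2 attempts, accuracy as the percentage of tasks solved) can be sketched as follows. This is an illustration under stated assumptions, not the official ARC Prize scorer: the function names, the task IDs, and the simplification of one test output per task are all hypothetical.

```python
def task_solved(attempts, expected, max_attempts=2):
    """A task is solved if any of the first `max_attempts` grids matches exactly."""
    return any(attempt == expected for attempt in attempts[:max_attempts])


def accuracy(submissions, solutions, max_attempts=2):
    """Percentage of tasks solved; tasks with no submission count as unsolved."""
    solved = sum(
        task_solved(submissions.get(task_id, []), expected, max_attempts)
        for task_id, expected in solutions.items()
    )
    return 100.0 * solved / len(solutions)


# Hypothetical data: one expected output grid per task (a simplification).
solutions = {"t1": [[1, 2]], "t2": [[0]], "t3": [[5, 5], [5, 5]], "t4": [[7]]}
submissions = {
    "t1": [[[1, 2]]],              # correct on the first attempt
    "t2": [[[1]], [[0]]],          # correct on the second attempt
    "t3": [[[5, 5]], [[5], [5]]],  # both attempts wrong (shape mismatch)
    "t4": [[[3]], [[4]], [[7]]],   # correct only on a third attempt, which is ignored
}

print(accuracy(submissions, solutions))  # 50.0
```

Exact-match comparison means a prediction with a single wrong cell or the wrong grid shape earns no partial credit.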
