
DROP

by Allen AI · open-source · Last verified 2026-03-01

Discrete Reasoning Over Paragraphs (DROP) is a reading-comprehension benchmark that requires numerical reasoning over text passages. It tests abilities such as addition, counting, sorting, and comparison, each of which requires understanding the paragraph content and performing multi-step discrete operations.

https://allenai.org/data/drop
Overall grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: B+ · Citations: B+ · Engagement: F

Specifications

License: Apache-2.0
Pricing: open-source
Capabilities: model-evaluation, numerical-reasoning-testing, reading-comprehension-assessment
Integrations: lm-eval-harness
Use Cases: reasoning-evaluation, numerical-ability-testing, model-comparison
API Available: No
Evaluated Models: claude-4, gpt-5, gemini-2.5-pro, deepseek-v3
Metrics: f1-score, exact-match
Methodology: Reading comprehension with questions requiring discrete reasoning operations like counting, sorting, and arithmetic over passage content.
Last Run: 2026-01-25
Tags: benchmark, evaluation, reading-comprehension, reasoning, numerical
Added: 2026-03-17
Completeness: 100%
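
The two metrics listed in the specifications, exact-match and f1-score, can be illustrated with a short sketch. This is a simplified approximation and not the official DROP evaluation script, which additionally aligns multi-span answers. The function names (`normalize`, `exact_match`, `token_f1`, `best_score`) are hypothetical and chosen only for this example.

```python
import re
import string
from collections import Counter

# Assumption: a simplified normalization, loosely modeled on common
# reading-comprehension scoring. Not the official DROP evaluator.
_ARTICLES = {"a", "an", "the"}
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def normalize(text: str) -> list[str]:
    """Lowercase, canonicalize numbers, strip punctuation and articles."""
    tokens = []
    for tok in text.lower().split():
        tok = tok.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if _NUMBER.fullmatch(tok):
            # "1,000" and "1000.0" both become "1000.0"
            tok = str(float(tok.replace(",", "")))
        else:
            tok = "".join(ch for ch in tok if ch not in string.punctuation)
        if tok and tok not in _ARTICLES:
            tokens.append(tok)
    return tokens


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall (bag of tokens)."""
    pred, ref = normalize(prediction), normalize(gold)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def best_score(prediction: str, golds: list[str]) -> tuple[float, float]:
    """Score against several reference answers and keep the best of each metric."""
    em = max(exact_match(prediction, g) for g in golds)
    f1 = max(token_f1(prediction, g) for g in golds)
    return em, f1
```

For instance, `best_score("Tom Brady", ["Brady", "Tom Brady"])` scores the prediction against both references and keeps the higher value for each metric, so a partial overlap with one reference does not penalize an exact match with another.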

Index Score

66.7
Adoption: 76
Quality: 84
Freshness: 72
Citations: 78
Engagement: 0
