DROP
by Allen AI · open-source · Last verified 2026-03-01
DROP (Discrete Reasoning Over Paragraphs) is a reading comprehension benchmark that requires numerical reasoning over text passages. It tests operations such as addition, counting, sorting, and comparison, each of which depends on understanding the paragraph and carrying out multi-step discrete operations on its content.
https://allenai.org/data/drop
B (Above Average)
Adoption: B+ · Quality: A · Freshness: B+ · Citations: B+ · Engagement: F
Specifications
- License
- CC BY-SA 4.0
- Pricing
- open-source
- Capabilities
- model-evaluation, numerical-reasoning-testing, reading-comprehension-assessment
- Integrations
- lm-eval-harness
- Use Cases
- reasoning-evaluation, numerical-ability-testing, model-comparison
- API Available
- No
- Evaluated Models
- claude-4, gpt-5, gemini-2.5-pro, deepseek-v3
- Metrics
- f1-score, exact-match
- Methodology
- Reading comprehension with questions requiring discrete reasoning operations like counting, sorting, and arithmetic over passage content.
- Last Run
- 2026-01-25
- Tags
- benchmark, evaluation, reading-comprehension, reasoning, numerical
- Added
- 2026-03-17
- Completeness
- 100%
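The two metrics listed above, exact-match and f1-score, can be illustrated with a minimal Python sketch. This is a simplified, SQuAD-style token-overlap version, not the official DROP evaluator; the function names (`normalize`, `exact_match`, `token_f1`) and the normalization rules are assumptions made for illustration.

```python
import re
import string
from collections import Counter

ARTICLES = {"a", "an", "the"}
# Assumed number format: optional sign, digits with optional commas, optional decimals.
NUMBER = re.compile(r"-?\d[\d,]*(\.\d+)?")


def normalize(text: str) -> list[str]:
    """Lowercase, drop articles and punctuation, and canonicalize numbers."""
    tokens = []
    for raw in text.lower().split():
        token = raw.rstrip(".,;:!?")  # trailing sentence punctuation
        if NUMBER.fullmatch(token):
            # "1,000" and "1000" both become "1000.0"; "3.5" stays distinct from "35"
            token = str(float(token.replace(",", "")))
        else:
            token = "".join(ch for ch in token if ch not in string.punctuation)
        if token and token not in ARTICLES:
            tokens.append(token)
    return tokens


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("3 field goals", "3 touchdowns")` gives partial credit for the shared number, while `exact_match` on the same pair is 0. The official evaluation additionally aligns multi-span answers and takes the best score over multiple gold annotations, which this sketch does not attempt.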
Index Score
66.7
Adoption
76
Quality
84
Freshness
72
Citations
78
Engagement
0