FinanceBench
by Islam et al. / Patronus AI · open-source · Last verified 2026-03-17
FinanceBench evaluates LLMs on financial question-answering over publicly available company filings such as 10-Ks and earnings reports. It tests numerical reasoning, document retrieval, and the ability to answer questions requiring multi-step calculations over financial data.
https://github.com/patronus-ai/financebench
Overall grade: B (Above Average)
Adoption: B · Quality: A · Freshness: B+ · Citations: B+ · Engagement: F
Specifications
- License
- CC BY-NC 4.0
- Pricing
- open-source
- Capabilities
- evaluation, financial-reasoning, document-qa
- Integrations
- Use Cases
- model-evaluation, financial-ai, rag-evaluation
- API Available
- No
- Evaluated Models
- gpt-4o, claude-opus-4, gemini-2-5-pro
- Metrics
- accuracy, exact-match
- Methodology
- 150 questions derived from public company filings, with ground-truth answers verified by financial professionals. Models are evaluated in both retrieval-augmented (open-book) and closed-book configurations; exact-match accuracy and human-judged accuracy are reported.
- Last Run
- 2026-02-10
- Tags
- finance, rag, numerical-reasoning, earnings, qa
- Added
- 2026-03-17
- Completeness
- 100%
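As a rough illustration of the exact-match metric named under Metrics and Methodology, the sketch below scores a list of model answers against reference answers. It is not the benchmark's official scorer: the `normalize` rules (lowercasing, dropping `$` and thousands separators) and the sample answers are assumptions made up for this example.

```python
import re


def normalize(answer: str) -> str:
    """Assumed normalization: lowercase, drop '$' and ',', collapse spaces."""
    text = answer.strip().lower()
    text = re.sub(r"[$,]", "", text)   # "$1,577" -> "1577"
    text = re.sub(r"\s+", " ", text)   # collapse runs of whitespace
    return text.rstrip(".")            # ignore a trailing period


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    # Hypothetical answers, not taken from the FinanceBench dataset.
    preds = ["$1,577 million", "8.7%", "Yes"]
    refs = ["1577 million", "8.70%", "yes"]
    print(f"exact-match accuracy: {exact_match_accuracy(preds, refs):.2f}")
```

In this sample, "8.7%" and "8.70%" count as a miss under string matching even though they are numerically equal, which is one reason a listing like this reports human-judged accuracy alongside exact match.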
Index Score: 62.8
Adoption: 68
Quality: 88
Freshness: 78
Citations: 72
Engagement: 0