Benchmark · Computer Vision · v1.0

DocVQA

by CVC Barcelona · free · Last verified 2026-03-01

DocVQA is a large-scale dataset and benchmark for Visual Question Answering on document images. It challenges models to answer questions by reading and interpreting text, understanding layouts, and reasoning about information within complex documents like forms, invoices, and reports. It serves as a standard for evaluating document intelligence systems.

https://www.docvqa.org

C (Below Average)

Adoption: B+ · Quality: A · Freshness: B+ · Citations: F · Engagement: F

Specifications

License: Apache-2.0
Pricing: free
Capabilities: Benchmarking multimodal model performance, Evaluating visual question answering on documents, Assessing text extraction (OCR) in context, Testing comprehension of complex document layouts, Measuring reasoning over structured and unstructured text, Standardized evaluation for comparing document AI models, Providing a large-scale dataset of document images and QA pairs
API Available: No
Evaluated Models: claude-4, gpt-5, gemini-2.5-pro
Metrics: anls-score, accuracy
Methodology: Questions about scanned document images requiring text extraction and layout understanding. Evaluated using Average Normalized Levenshtein Similarity (ANLS) metric.
Last Run: 2026-02-10
Tags: benchmark, dataset, document-ai, document-understanding, evaluation, information-extraction, multimodal, nlp, ocr, vqa
Added: 2026-03-17
Completeness: 80%
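
The ANLS metric named under Methodology can be sketched in a few lines of Python. This is a minimal illustration, not the official evaluation script: the function names `levenshtein` and `anls` are hypothetical, the 0.5 threshold is the value commonly used with ANLS, and lowercasing and trimming answers before comparison is an assumption about normalization. For each question, the prediction is compared against every accepted answer, the best similarity is kept, and similarities whose normalized edit distance reaches the threshold count as zero.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def anls(predictions: Sequence[str],
         references: Sequence[Sequence[str]],
         threshold: float = 0.5) -> float:
    """Average Normalized Levenshtein Similarity over a set of questions.

    predictions: one predicted answer per question.
    references:  for each question, the list of accepted answers.
    threshold:   normalized distances at or above this score zero
                 (0.5 is assumed here as the usual setting).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must align")
    if not predictions:
        return 0.0
    total = 0.0
    for pred, answers in zip(predictions, references):
        p = pred.strip().lower()  # assumed normalization
        best = 0.0
        for answer in answers:
            g = answer.strip().lower()
            longest = max(len(p), len(g))
            nl = levenshtein(p, g) / longest if longest else 0.0
            similarity = 1.0 - nl if nl < threshold else 0.0
            best = max(best, similarity)
        total += best
    return total / len(predictions)


if __name__ == "__main__":
    preds = ["Barcelona", "invoce"]
    refs = [["barcelona"], ["invoice", "the invoice"]]
    print(f"ANLS: {anls(preds, refs):.4f}")
```

The threshold is what separates ANLS from plain edit-distance similarity: a small OCR slip such as "invoce" for "invoice" still earns partial credit, while an answer that is mostly wrong earns none.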

