TyDi QA
by Clark et al. / Google Research · free · Last verified 2026-03-17
TyDi QA is a multilingual question-answering benchmark featuring 11 typologically diverse languages. Questions are written natively by speakers of each language, ensuring genuine linguistic challenges and avoiding translation artifacts. It is designed to evaluate reading comprehension across a wide range of language structures.
https://ai.google.com/research/tydiqa
B—Above Average
Adoption: B+ · Quality: A · Freshness: B · Citations: B+ · Engagement: F
Specifications
- License
- Apache-2.0
- Pricing
- free
- Capabilities
- multilingual-question-answering, extractive-qa-evaluation, cross-lingual-transfer-assessment, reading-comprehension-benchmarking, typological-diversity-testing, zero-shot-evaluation, few-shot-evaluation
- API Available
- No
- Evaluated Models
- gpt-4o, multilingual-bert, xlm-roberta-large, gemini-2-5-pro
- Metrics
- f1-score, exact-match
- Methodology
- Gold passage (GoldP) task: the model selects an answer span from a provided Wikipedia passage. GoldP F1 and exact match (EM) are averaged across 11 languages. The primary task is span extraction; the secondary task is answer presence detection (boolean).
- Last Run
- 2025-12-20
- Tags
- question-answering, multilingual, typologically-diverse, reading-comprehension, nlp-benchmark, cross-lingual-transfer, dataset, evaluation, linguistic-diversity, extractive-qa
- Added
- 2026-03-17
- Completeness
- 0.9%
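The Methodology entry above reports F1 and exact match averaged across languages. The following is a minimal sketch of how such a macro-average could be computed, not the benchmark's official evaluation script. It assumes SQuAD-style token-overlap F1, whitespace tokenization (which will not suit languages written without spaces), and a single gold answer per question. The function names `normalize`, `exact_match`, `token_f1`, and `macro_average` are hypothetical.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, split on whitespace (an assumption)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction, gold):
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    """Token-overlap F1 between a predicted span and a gold span."""
    pred, ref = normalize(prediction), normalize(gold)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def macro_average(examples):
    """Average EM and F1 within each language, then across languages.

    `examples` is an iterable of (language, prediction, gold) tuples.
    Returns a dict with keys "em" and "f1".
    """
    per_lang = {}
    for lang, pred, gold in examples:
        scores = per_lang.setdefault(lang, {"em": [], "f1": []})
        scores["em"].append(exact_match(pred, gold))
        scores["f1"].append(token_f1(pred, gold))
    if not per_lang:
        raise ValueError("no examples given")
    lang_em = [sum(s["em"]) / len(s["em"]) for s in per_lang.values()]
    lang_f1 = [sum(s["f1"]) / len(s["f1"]) for s in per_lang.values()]
    return {"em": sum(lang_em) / len(lang_em), "f1": sum(lang_f1) / len(lang_f1)}


if __name__ == "__main__":
    # Toy predictions, invented purely for illustration.
    toy = [
        ("en", "the Eiffel Tower", "Eiffel Tower"),
        ("fi", "Helsinki", "Helsinki"),
    ]
    print(macro_average(toy))
```

Averaging per language first gives every language equal weight regardless of how many questions it contributes, which is the usual reason to prefer a macro-average on a typologically diverse benchmark.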
Index Score: 66.1
Adoption: 72
Quality: 89
Freshness: 67
Citations: 78
Engagement: 0