TyDi QA Dataset
by Google Research · free · Last verified 2026-03-17
TyDi QA is a question answering benchmark covering 11 typologically diverse languages. Its questions are information-seeking: native speakers wrote them without having seen the answer, so they reflect what a person actually wants to know rather than a paraphrase of a known passage. This design tests whether models generalize beyond high-resource, typologically similar languages.
https://huggingface.co/datasets/copenlu/answerable_tydiqa
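A minimal sketch of how the dataset at the URL above might be loaded and inspected per language. It assumes the Hugging Face `datasets` library is installed, that a `validation` split exists, and that each record carries a `language` column; the split name, the column name, and both helper functions are illustrative assumptions, not details confirmed by this listing.

```python
from collections import Counter

# Dataset ID taken from the URL above; everything else here is an assumption.
DATASET_ID = "copenlu/answerable_tydiqa"


def load_answerable_tydiqa(split="validation"):
    """Download one split from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the counting helper below works without the library.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


def count_by_language(records, language_key="language"):
    """Count records per language; `language_key` is an assumed column name."""
    return Counter(record[language_key] for record in records)
```

Calling `count_by_language(load_answerable_tydiqa())` would then give a per-language record count, which is a quick way to check how evenly the languages are represented before running a cross-lingual evaluation.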
Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: B · Citations: A · Engagement: F
Specifications
- License: Apache-2.0
- Pricing: free
- Capabilities: Multilingual Question Answering, Cross-Lingual Transfer Learning Evaluation, Reading Comprehension in Low-Resource Languages, Information-Seeking Intent Modeling, Typological Diversity Benchmarking, Extractive Question Answering, Zero-Shot and Few-Shot QA Evaluation, Robustness Testing for Morphological and Syntactic Variation
- Integrations: none listed
- API Available: No
- Tags: question-answering, multilingual, typologically-diverse, google, information-seeking, nlp-benchmark, reading-comprehension, low-resource-languages, cross-lingual, extractive-qa
- Added: 2026-03-17
- Completeness: 0.8%
Index Score: 66.9
- Adoption: 72
- Quality: 88
- Freshness: 68
- Citations: 82
- Engagement: 0