Dataset · multilingual · v1.1

TyDi QA Dataset

by Google Research · free · Last verified 2026-03-17

TyDi QA is a question answering benchmark covering 11 typologically diverse languages. Its questions are information-seeking: native speakers wrote them without having seen the answer, so they reflect what people actually want to know rather than paraphrases of a known passage. This design tests whether models generalize beyond high-resource, typologically similar languages.

https://huggingface.co/datasets/copenlu/answerable_tydiqa
B · Above Average
Adoption: B+ · Quality: A · Freshness: B · Citations: A · Engagement: F

Specifications

License: Apache-2.0
Pricing: free
Capabilities: Multilingual Question Answering, Cross-Lingual Transfer Learning Evaluation, Reading Comprehension in Low-Resource Languages, Information-Seeking Intent Modeling, Typological Diversity Benchmarking, Extractive Question Answering, Zero-Shot and Few-Shot QA Evaluation, Robustness Testing for Morphological and Syntactic Variation
Integrations: none listed
API Available: No
Tags: question-answering, multilingual, typologically-diverse, google, information-seeking, nlp-benchmark, reading-comprehension, low-resource-languages, cross-lingual, extractive-qa
Added: 2026-03-17
Completeness: 0.8%
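
For the Extractive Question Answering capability listed above, predictions are commonly scored with exact match and token-overlap F1. The following is a minimal sketch of those two metrics in the SQuAD style, not the official TyDi QA scorer. The function names `exact_match` and `token_f1` are hypothetical, and whitespace tokenization is an assumption that does not suit languages written without spaces between words.

```python
from collections import Counter


def exact_match(prediction: str, gold: str) -> bool:
    """True when the two answers are identical after trimming and case-folding."""
    return prediction.strip().casefold() == gold.strip().casefold()


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall over whitespace tokens."""
    pred_tokens = prediction.casefold().split()
    gold_tokens = gold.casefold().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as agreement; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Averaging these scores separately for each language, rather than over the pooled examples, keeps languages with more examples from dominating the result.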

Index Score

66.9
Adoption: 72
Quality: 88
Freshness: 68
Citations: 82
Engagement: 0
