Benchmark · LLMs · v1.1

TyDi QA

by Clark et al. / Google Research · free · Last verified 2026-03-17

TyDi QA is a multilingual question-answering benchmark featuring 11 typologically diverse languages. Questions are written natively by speakers of each language, ensuring genuine linguistic challenges and avoiding translation artifacts. It is designed to evaluate reading comprehension across a wide range of language structures.

https://ai.google.com/research/tydiqa
B · Above Average
Adoption: B+ · Quality: A · Freshness: B · Citations: B+ · Engagement: F

Specifications

License
Apache-2.0
Pricing
free
Capabilities
multilingual-question-answering, extractive-qa-evaluation, cross-lingual-transfer-assessment, reading-comprehension-benchmarking, typological-diversity-testing, zero-shot-evaluation, few-shot-evaluation
API Available
No
Evaluated Models
gpt-4o, multilingual-bert, xlm-roberta-large, gemini-2-5-pro
Metrics
f1-score, exact-match
Methodology
Gold passage (GoldP) task: the model selects an answer span from a provided Wikipedia passage. GoldP F1 and exact match (EM) are averaged across the 11 languages. The primary task is span extraction; the secondary task is answer presence detection (boolean).
Last Run
2025-12-20
Tags
question-answering, multilingual, typologically-diverse, reading-comprehension, nlp-benchmark, cross-lingual-transfer, dataset, evaluation, linguistic-diversity, extractive-qa
Added
2026-03-17
Completeness
0.9%
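
The Methodology entry above reports F1 and exact match averaged across languages. A minimal sketch of how such scores could be computed is shown here. It is not the official TyDi QA evaluation script: the whitespace tokenization, the function names, and the sample predictions are all assumptions made for illustration, and whitespace splitting would not suit scripts written without word spacing.

```python
from collections import Counter


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 when the whitespace-normalized answers are identical."""
    return float(prediction.split() == gold.split())


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer span."""
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as agreement; only one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def macro_average(per_language: dict) -> float:
    """Average per-language means so each language carries equal weight."""
    means = [sum(scores) / len(scores) for scores in per_language.values()]
    return sum(means) / len(means)


# Hypothetical (prediction, gold) pairs keyed by language code.
examples = {
    "fi": [("Helsinki", "Helsinki"), ("vuonna 1917", "1917")],
    "sw": [("Dar es Salaam", "Dodoma")],
}
f1_by_lang = {
    lang: [token_f1(p, g) for p, g in pairs] for lang, pairs in examples.items()
}
em_by_lang = {
    lang: [exact_match(p, g) for p, g in pairs] for lang, pairs in examples.items()
}
print("F1:", macro_average(f1_by_lang))
print("EM:", macro_average(em_by_lang))
```

Averaging per language first, rather than pooling all examples, keeps a language with many questions from dominating the reported score.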

Index Score

66.1
Adoption
72
Quality
89
Freshness
67
Citations
78
Engagement
0
