Dataset · multilingual · v2.0

XL-Sum Dataset

by BUET (Bangladesh University of Engineering and Technology) · free · Last verified 2026-03-17

XL-Sum is a large-scale multilingual dataset for abstractive summarization. It contains over 1 million article-summary pairs collected from BBC News across 44 languages. That breadth, which includes low-resource languages, makes it a common choice for training and evaluating multilingual and cross-lingual summarization models.

https://huggingface.co/datasets/csebuetnlp/xlsum
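
As a rough illustration of working with article-summary pairs like these, the sketch below loads one language from the Hub and measures how much shorter a summary is than its article. It rests on several assumptions that should be checked against the dataset card: the configuration name `"english"`, the field names `"text"` and `"summary"`, and that the installed `datasets` version can load this repository without extra arguments. The helper names `load_xlsum_split` and `compression_ratio` are hypothetical, and the sample record is hand-written, not real XL-Sum data.

```python
from typing import Dict


def load_xlsum_split(language: str = "english", split: str = "validation"):
    """Load one language configuration of XL-Sum from the Hugging Face Hub.

    Needs network access and the `datasets` package. The config name and
    split name are assumptions; check the dataset card before relying on them.
    """
    # Imported lazily so the helper below works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("csebuetnlp/xlsum", language, split=split)


def compression_ratio(example: Dict[str, str]) -> float:
    """Return summary length divided by article length, in whitespace tokens.

    Assumes the record has "text" (article) and "summary" fields.
    """
    article_tokens = example["text"].split()
    summary_tokens = example["summary"].split()
    if not article_tokens:
        return 0.0  # avoid division by zero on empty articles
    return len(summary_tokens) / len(article_tokens)


# Hand-written record mirroring the assumed schema; not real XL-Sum data.
sample = {
    "text": "one two three four five six seven eight nine ten",
    "summary": "one two",
}
ratio = compression_ratio(sample)
print(f"compression ratio: {ratio:.2f}")
```

Whitespace splitting is only a crude length measure and is not meaningful for languages written without spaces between words, so a language-appropriate tokenizer would be needed for a real comparison across all 44 languages.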
Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: B+ · Citations: B+ · Engagement: F

Specifications

License: CC-BY-NC-SA-4.0
Pricing: free
Capabilities: multilingual-text-summarization, cross-lingual-summarization-research, abstractive-summary-generation, low-resource-language-nlp, model-evaluation-and-benchmarking, transfer-learning-for-nlp, news-article-analysis
Integrations: none listed
API Available: No
Tags: summarization, multilingual, news, bbc, nlp-dataset, abstractive-summarization, cross-lingual, text-generation, low-resource-languages, sequence-to-sequence
Added: 2026-03-17
Completeness: 0.7%

Index Score

64.9
Adoption: 71
Quality: 85
Freshness: 70
Citations: 78
Engagement: 0
