Dataset · multilingual · v2.0

XL-Sum Dataset

by BUET (Bangladesh University of Engineering and Technology) · open-source · Last verified 2026-03-17

XL-Sum is a large-scale multilingual abstractive summarization dataset containing over 1 million professionally written article-summary pairs collected from BBC News across 44 languages. It supports cross-lingual and multilingual summarization research across a diverse range of scripts and linguistic typologies.

https://huggingface.co/datasets/csebuetnlp/xlsum
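
Since the dataset is distributed through huggingface-datasets, a minimal loading sketch may help. It assumes the `datasets` package is installed, that each language is exposed as a lowercase configuration name such as "english", and that records carry `text` and `summary` fields; check all three against the dataset card before relying on them. Depending on the installed `datasets` version, script-based datasets may need extra options or an older release. The helper names `to_pairs` and `load_xlsum_pairs` are hypothetical and not part of any library.

```python
from typing import Dict, Iterable, List, Tuple


def to_pairs(records: Iterable[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Collect (article, summary) pairs, skipping records with an empty field.

    Assumes each record has "text" and "summary" keys (an assumption taken
    from the dataset card, not guaranteed here).
    """
    pairs = []
    for record in records:
        article = record.get("text", "").strip()
        summary = record.get("summary", "").strip()
        if article and summary:
            pairs.append((article, summary))
    return pairs


def load_xlsum_pairs(language: str = "english", split: str = "validation"):
    """Download one language configuration and return its pairs.

    Needs network access and the `datasets` package. The import is lazy so
    that `to_pairs` stays usable without it.
    """
    from datasets import load_dataset

    # The configuration name (e.g. "english") is an assumption; see the card.
    dataset = load_dataset("csebuetnlp/xlsum", language, split=split)
    return to_pairs(dataset)
```

Calling `load_xlsum_pairs("english", "validation")` should return a list of (article, summary) tuples for that split, provided the assumptions above hold. Note that the CC-BY-NC-SA-4.0 license restricts use to non-commercial purposes.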
Overall grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: B+ · Citations: B+ · Engagement: F

Specifications

License: CC-BY-NC-SA-4.0
Pricing: open-source
Capabilities: multilingual-summarization, abstractive-summarization
Integrations: huggingface-datasets
Use Cases: summarization, multilingual-nlp, news-processing
API Available: No
Tags: summarization, multilingual, news, 44-languages, bbc
Added: 2026-03-17
Completeness: 100%

Index Score

Overall: 64.9
Adoption: 71
Quality: 85
Freshness: 70
Citations: 78
Engagement: 0
