XL-Sum Dataset
by BUET (Bangladesh University of Engineering and Technology) · open-source · Last verified 2026-03-17
XL-Sum is a large-scale multilingual abstractive summarization dataset containing over 1 million professionally written article-summary pairs collected from BBC News in 44 languages. It supports cross-lingual and multilingual summarization research across a diverse range of scripts and linguistic typologies.
https://huggingface.co/datasets/csebuetnlp/xlsum
Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: B+ · Citations: B+ · Engagement: F
Specifications
- License: CC-BY-NC-SA-4.0
- Pricing: open-source
- Capabilities: multilingual-summarization, abstractive-summarization
- Integrations: huggingface-datasets
- Use Cases: summarization, multilingual-nlp, news-processing
- API Available: No
- Tags: summarization, multilingual, news, 44-languages, bbc
- Added: 2026-03-17
- Completeness: 100%
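Since the listing names huggingface-datasets as the only integration, a minimal Python sketch of loading one language and extracting article-summary pairs might look like the following. The repository id comes from the URL above. Everything else is an assumption to check against the dataset card: the configuration name (`"english"`), the split name, and the `text` and `summary` field names. The helper names `load_xlsum` and `iter_pairs` are hypothetical. Depending on the installed `datasets` version, loading this repository may need extra options or may not work as written.

```python
from typing import Iterable, Iterator, Mapping, Optional, Tuple

REPO_ID = "csebuetnlp/xlsum"  # repository id taken from the listing URL


def load_xlsum(language: str = "english", split: str = "train"):
    """Load one language configuration from the Hugging Face Hub.

    Needs `pip install datasets` and network access.
    Assumption: configurations are named after languages (e.g. "english").
    """
    # Imported lazily so iter_pairs below stays usable without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, language, split=split)


def iter_pairs(
    records: Iterable[Mapping[str, Optional[str]]],
    text_field: str = "text",        # assumed name of the article field
    summary_field: str = "summary",  # assumed name of the summary field
) -> Iterator[Tuple[str, str]]:
    """Yield (article, summary) tuples, skipping records with an empty side."""
    for record in records:
        article = (record.get(text_field) or "").strip()
        summary = (record.get(summary_field) or "").strip()
        if article and summary:
            yield article, summary
```

With the library installed, something like `pairs = list(iter_pairs(load_xlsum("english", split="validation")))` would collect the pairs for one language, assuming that split exists. The CC-BY-NC-SA-4.0 license listed above restricts use to non-commercial purposes.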
Index Score: 64.9
- Adoption: 71
- Quality: 85
- Freshness: 70
- Citations: 78
- Engagement: 0