LibriSpeech
by OpenSLR / Johns Hopkins University · free · Last verified 2026-03-17
LibriSpeech is a corpus of approximately 1,000 hours of read English speech, sampled at 16 kHz and derived from LibriVox audiobooks. The training data is split into two "clean" subsets of 100 and 360 hours and one "other" subset of 500 hours, with dedicated clean and other development and test sets. It has become the de facto standard benchmark for English ASR systems.
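As a rough check on the "approximately 1,000 hours" figure, the nominal training subset sizes can be summed. This is a minimal sketch: the subset names and the 500-hour "other" subset follow the commonly published corpus layout and are assumptions relative to this listing, and the helper names are hypothetical.

```python
# Nominal training subset sizes in hours (rounded values; assumed from the
# commonly published corpus layout, not stated in full by this listing).
TRAIN_SUBSETS = {
    "train-clean-100": 100,
    "train-clean-360": 360,
    "train-other-500": 500,
}


def total_hours(subsets):
    """Sum the nominal hours across all given subsets."""
    return sum(subsets.values())


def clean_hours(subsets):
    """Sum the nominal hours of the subsets labelled 'clean'."""
    return sum(hours for name, hours in subsets.items() if "-clean-" in name)


if __name__ == "__main__":
    print(total_hours(TRAIN_SUBSETS))  # 960
    print(clean_hours(TRAIN_SUBSETS))  # 460
```

The development and test sets are small by comparison, so the training subsets account for nearly all of the corpus.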
https://www.openslr.org/12
Overall grade: C+ (Average)
Adoption: A+ · Quality: A+ · Freshness: C+ · Citations: F · Engagement: F
Specifications
- License: CC-BY-4.0
- Pricing: free
- Capabilities: speech-recognition, speech-synthesis, speaker-identification
- Integrations: HuggingFace Datasets, torchaudio, ESPnet
- Use Cases: model-training, benchmark, speech-research
- API Available: No
- Tags: automatic-speech-recognition, ASR, english, audiobooks, benchmark
- Added: 2026-03-17
- Completeness: 100%
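The speech-recognition and speaker-identification capabilities listed above both rely on the per-utterance labels shipped with the audio. A minimal sketch of reading them, assuming the layout used in the distributed archives, where each line of a `*.trans.txt` file holds a `speaker-chapter-utterance` key followed by the transcript. The `Utterance` type, the function name, and the sample line are hypothetical and for illustration only.

```python
from typing import NamedTuple


class Utterance(NamedTuple):
    speaker_id: int
    chapter_id: int
    utterance_id: int
    text: str


def parse_transcript_line(line: str) -> Utterance:
    """Split one transcript line into its ID fields and its text.

    Assumes the form "<speaker>-<chapter>-<utterance> <TRANSCRIPT>".
    """
    key, _, text = line.strip().partition(" ")
    speaker, chapter, utterance = key.split("-")
    return Utterance(int(speaker), int(chapter), int(utterance), text)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the corpus.
    sample = "103-1240-0007 HELLO WORLD"
    print(parse_transcript_line(sample))
```

The speaker ID recovered this way is what a speaker-identification model would use as its label, while the text serves as the ASR target.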
Index Score: 56
- Adoption: 95
- Quality: 92
- Freshness: 55
- Citations: 0
- Engagement: 0