Dataset · Speech & Audio AI · v15.0

Common Voice 15

by Mozilla · open-source · Last verified 2026-03-17

Mozilla's Common Voice 15.0 is the world's largest publicly available multilingual speech corpus, containing over 30,000 hours of validated speech data across 114 languages, all contributed and validated by volunteers. It enables training and evaluation of multilingual and low-resource speech recognition systems.
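Because the clips are validated by volunteer votes, a common first step before training is to filter the transcript metadata by vote counts. The sketch below is illustrative only: it assumes a tab-separated metadata file with `path`, `sentence`, `up_votes`, and `down_votes` columns, and the function names, the `min_margin` parameter, and the sample rows are hypothetical. Check the column names against the files in the release you actually download.

```python
import csv
import io
from typing import Iterable, Iterator


def read_tsv(handle) -> Iterator[dict]:
    """Parse a tab-separated metadata file into dict rows (no quote handling)."""
    return csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def filter_by_votes(rows: Iterable[dict], min_margin: int = 1) -> Iterator[dict]:
    """Yield rows whose up-votes exceed down-votes by at least `min_margin`."""
    for row in rows:
        try:
            up = int(row["up_votes"])
            down = int(row["down_votes"])
        except (KeyError, ValueError, TypeError):
            # Skip rows with missing or malformed vote counts.
            continue
        if up - down >= min_margin:
            yield row


# Hypothetical sample data standing in for a real metadata file.
SAMPLE_TSV = (
    "path\tsentence\tup_votes\tdown_votes\n"
    "clip_0001.mp3\tHello world.\t2\t0\n"
    "clip_0002.mp3\tNoisy recording.\t1\t2\n"
    "clip_0003.mp3\tGood morning.\t3\t1\n"
)

if __name__ == "__main__":
    rows = read_tsv(io.StringIO(SAMPLE_TSV))
    for row in filter_by_votes(rows, min_margin=2):
        print(row["path"], row["sentence"])
```

To run this against a real file, replace the `io.StringIO` handle with `open(path, encoding="utf-8", newline="")`. A stricter `min_margin` trades corpus size for transcript reliability, which matters most for low-resource languages where clips are scarce.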

https://commonvoice.mozilla.org
Grade: B+ (Good)
Adoption: A · Quality: A · Freshness: A+ · Citations: A · Engagement: F

Specifications

License: CC-0
Pricing: open-source
Capabilities: multilingual-asr, low-resource-speech, speaker-diversity
Integrations: HuggingFace Datasets, ESPnet, SpeechBrain
Use Cases: model-training, multilingual-research, low-resource-asr
API Available: No
Tags: ASR, multilingual, crowdsourced, speech-recognition, open-source
Added: 2026-03-17
Completeness: 100%

Index Score

72.6
Adoption: 88
Quality: 82
Freshness: 90
Citations: 84
Engagement: 0
