Voicebox
by Meta AI · open-source · Last verified 2026-03-17
Voicebox is Meta AI's generative speech model built on non-autoregressive flow matching. It is trained on a single text-guided speech infilling task and applies the resulting in-context learning to text-to-speech, noise removal, content editing, and style transfer, where Meta reports state-of-the-art results. Because it conditions on the surrounding audio and text rather than on a fixed speaker set, it generalizes to new voices and styles without fine-tuning, which makes it a zero-shot speech synthesizer.
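To make "non-autoregressive flow matching" concrete, here is a minimal Python sketch of the general technique, assuming the straight-line (optimal-transport) probability path that is commonly paired with it. It is not Meta's implementation. Every name in it (`SIGMA_MIN`, `ot_path_sample`, `ot_target_velocity`, `flow_matching_loss`, `euler_generate`, and the `predict_velocity(x_t, t, context)` callable) is a hypothetical stand-in. Plain lists of floats stand in for audio feature frames, and `context` stands in for whatever the model conditions on, such as text and unmasked audio.

```python
import random

SIGMA_MIN = 1e-5  # assumed small residual noise level at t = 1 (illustrative value)


def ot_path_sample(x0, x1, t, sigma_min=SIGMA_MIN):
    """Point at time t on the straight-line path from noise x0 toward data x1."""
    return [(1 - (1 - sigma_min) * t) * a + t * b for a, b in zip(x0, x1)]


def ot_target_velocity(x0, x1, sigma_min=SIGMA_MIN):
    """Time derivative of the path above; it does not depend on t."""
    return [b - (1 - sigma_min) * a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x1, context, rng):
    """One-sample estimate of the conditional flow-matching regression loss."""
    t = rng.random()                              # t drawn uniformly from [0, 1)
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]        # Gaussian noise sample
    x_t = ot_path_sample(x0, x1, t)
    target = ot_target_velocity(x0, x1)
    pred = predict_velocity(x_t, t, context)      # hypothetical model call
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(x1)


def euler_generate(predict_velocity, context, dim, steps, rng):
    """Integrate dx/dt = v(x, t, context) from t = 0 to t = 1 with Euler steps.

    All `dim` values are updated together at each step, which is the
    non-autoregressive part: nothing is generated left to right.
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    dt = 1.0 / steps
    for i in range(steps):
        v = predict_velocity(x, i * dt, context)
        x = [a + dt * b for a, b in zip(x, v)]
    return x
```

In this sketch training would minimize `flow_matching_loss` over many data samples, and generation would call `euler_generate` with a small number of steps. The step count trades speed against fidelity, a choice an autoregressive decoder does not offer in the same way.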
https://voicebox.metademolab.com
Overall grade: C+ (Average)
Adoption: D · Quality: A · Freshness: B · Citations: B+ · Engagement: F
Specifications
- License: Research Only
- Pricing: open-source
- Capabilities: text-to-speech, speech-editing, noise-removal, zero-shot-voice-cloning, cross-lingual-synthesis
- Integrations: pytorch
- Use Cases: research, speech-editing, voice-cloning, accessibility, multilingual-tts
- API Available: No
- Parameters: ~330M
- Context Window: N/A
- Modalities: text, audio
- Training Cutoff: 2023
- Tags: text-to-speech, speech-editing, in-context-learning, meta-ai, flow-matching
- Added: 2026-03-17
- Completeness: 100%
Index Score: 50.8
- Adoption: 38
- Quality: 88
- Freshness: 66
- Citations: 72
- Engagement: 0