Voicebox

by Meta AI · open-source · Last verified 2026-03-17

Voicebox is Meta AI's generative speech model, built on non-autoregressive flow matching. A single in-context learning approach covers text-to-speech, noise removal, content editing, and style transfer, with state-of-the-art results reported on these tasks. Its flow-matching architecture allows it to generalize to new voices and styles without fine-tuning, enabling zero-shot speech synthesis.

https://voicebox.metademolab.com
Grade: C+ (Average)
Adoption: D · Quality: A · Freshness: B · Citations: B+ · Engagement: F

Specifications

License: Research Only
Pricing: open-source
Capabilities: text-to-speech, speech-editing, noise-removal, zero-shot-voice-cloning, cross-lingual-synthesis
Integrations: pytorch
Use Cases: research, speech-editing, voice-cloning, accessibility, multilingual-tts
API Available: No
Parameters: ~330M
Context Window: N/A
Modalities: text, audio
Training Cutoff: 2023
Tags: text-to-speech, speech-editing, in-context-learning, meta-ai, flow-matching
Added: 2026-03-17
Completeness: 100%

Index Score

50.8
Adoption: 38
Quality: 88
Freshness: 66
Citations: 72
Engagement: 0