XTTS-v2
by Coqui AI · open-source · Last verified 2026-03-17
XTTS-v2 is Coqui AI's open-source cross-lingual text-to-speech model. It clones a voice from as little as 6 seconds of reference audio and supports 17 languages in a single model. It also offers real-time streaming inference, which makes it a common open-source choice for applications that need custom voice synthesis and low-latency output.
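As a rough illustration of the cloning workflow described above, the sketch below checks that a reference clip is at least 6 seconds long and that the requested language is one of the 17 codes listed on the XTTS-v2 model card, then hands the request to the Coqui `TTS` Python package. The helper names (`clip_seconds`, `validate_request`, `clone_voice`) and all file paths are hypothetical. The 6-second threshold is taken from the description here and is not something the library is known to enforce. The `TTS` package is assumed to be installed separately and is imported lazily, so the validation helpers run without it.

```python
import wave

# Language codes listed on the XTTS-v2 model card (17 in total).
XTTS_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}

# Assumption: threshold taken from the listing's description, not from the library.
MIN_REFERENCE_SECONDS = 6.0


def clip_seconds(wav_path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(wav_path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def validate_request(wav_path, language):
    """Raise ValueError if the language code or reference clip looks unsuitable."""
    if language not in XTTS_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    duration = clip_seconds(wav_path)
    if duration < MIN_REFERENCE_SECONDS:
        raise ValueError(
            f"reference clip is {duration:.1f}s; want >= {MIN_REFERENCE_SECONDS}s"
        )
    return duration


def clone_voice(text, wav_path, language, out_path):
    """Synthesize `text` in the voice of `wav_path` and write it to `out_path`."""
    validate_request(wav_path, language)

    # Lazy import: heavy dependency that downloads model weights on first use.
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text, speaker_wav=wav_path, language=language, file_path=out_path
    )
    return out_path
```

A call such as `clone_voice("Hello there.", "reference.wav", "en", "out.wav")` would then write the cloned speech to `out.wav`, assuming `reference.wav` exists and the package is installed.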
https://coqui.ai/blog/tts/open_xtts
C+ (Average)
Adoption: B · Quality: A · Freshness: B+ · Citations: B · Engagement: F
Specifications
- License: Coqui Public Model License
- Pricing: open-source
- Capabilities: text-to-speech, voice-cloning, multilingual-tts, real-time-streaming, emotion-control
- Integrations: huggingface, local-inference
- Use Cases: voice-cloning, audiobook-narration, virtual-assistants, game-characters, accessibility
- API Available: Yes
- Parameters: ~500M
- Context Window: N/A
- Modalities: text, audio
- Training Cutoff: 2023
- Tags: text-to-speech, voice-cloning, multilingual, open-source, coqui-tts
- Added: 2026-03-17
- Completeness: 100%
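The real-time-streaming and local-inference entries above can be sketched roughly as follows. The example is modeled on the streaming interface documented for the Coqui `TTS` library (`Xtts.inference_stream` with conditioning latents from `get_conditioning_latents`); the exact signatures should be treated as assumptions to check against the installed version. The function names `stream_xtts` and `time_to_first_chunk`, and the idea of a local `checkpoint_dir` containing `config.json` and the model weights, are hypothetical choices made for this illustration.

```python
import os
import time


def time_to_first_chunk(chunks):
    """Consume an iterator of audio chunks.

    Returns (seconds until the first chunk arrived, total chunk count).
    The first value is None if the iterator yields nothing.
    """
    start = time.perf_counter()
    first = None
    count = 0
    for _ in chunks:
        if first is None:
            first = time.perf_counter() - start
        count += 1
    return first, count


def stream_xtts(text, language, reference_wav, checkpoint_dir):
    """Yield audio chunks from a locally stored XTTS-v2 checkpoint.

    Assumes the Coqui TTS package is installed and that `checkpoint_dir`
    holds the downloaded model files, including `config.json`.
    """
    # Lazy imports: only needed when synthesis is actually requested.
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts

    config = XttsConfig()
    config.load_json(os.path.join(checkpoint_dir, "config.json"))
    model = Xtts.init_from_config(config)
    model.load_checkpoint(config, checkpoint_dir=checkpoint_dir)

    # Compute the speaker conditioning once from the reference clip.
    gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(
        audio_path=[reference_wav]
    )
    yield from model.inference_stream(
        text, language, gpt_cond_latent, speaker_embedding
    )
```

Wrapping the generator as `time_to_first_chunk(stream_xtts(...))` gives a simple way to measure first-chunk latency on a given machine. Because model loading happens inside the generator, that figure includes load time unless the model is loaded beforehand.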
Index Score
59.6
Adoption
65
Quality
83
Freshness
70
Citations
68
Engagement
0