
XTTS-v2

by Coqui AI · open-source · Last verified 2026-03-17

XTTS-v2 is Coqui AI's open-source cross-lingual text-to-speech model. It clones a voice from as little as 6 seconds of reference audio and supports 17 languages in a single model. It also offers streaming inference for real-time use, which suits applications that need custom voice synthesis and low-latency output.
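A minimal sketch of the voice-cloning call described above, assuming the Coqui `TTS` Python package is installed and that its `TTS.api.TTS` class is used with the model name `tts_models/multilingual/multi-dataset/xtts_v2`. The helpers `wav_duration_seconds`, `check_request`, and `clone_voice` are hypothetical names introduced here for illustration. Treating the 6-second figure as a hard minimum is also an assumption: the listing presents it as a lower bound for good results, not as a limit the library enforces.

```python
import wave

# Language codes listed on the XTTS-v2 model card (17 in total).
XTTS_V2_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}

# Assumption: use the "6 seconds of reference audio" figure as a guard.
MIN_REFERENCE_SECONDS = 6.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_request(language: str, reference_wav: str) -> float:
    """Validate the language code and reference clip; return clip duration."""
    if language not in XTTS_V2_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    duration = wav_duration_seconds(reference_wav)
    if duration < MIN_REFERENCE_SECONDS:
        raise ValueError(
            f"reference clip is {duration:.1f}s; "
            f"expected at least {MIN_REFERENCE_SECONDS:.0f}s"
        )
    return duration


def clone_voice(text: str, reference_wav: str, language: str, out_path: str) -> str:
    """Synthesize `text` in the voice of `reference_wav` and write a WAV file."""
    check_request(language, reference_wav)
    # Imported lazily so the validation helpers work without the TTS package.
    from TTS.api import TTS

    # The first call downloads the model weights, which can take a while.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wav,
        language=language,
        file_path=out_path,
    )
    return out_path
```

A call such as `clone_voice("Hello there.", "speaker.wav", "en", "out.wav")` would then write the cloned speech to `out.wav`, where `speaker.wav` stands in for any reference recording you supply.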

https://coqui.ai/blog/tts/open_xtts
Overall grade: C+ (Average)
Adoption: B · Quality: A · Freshness: B+ · Citations: B · Engagement: F

Specifications

License: Coqui Public Model License
Pricing: open-source
Capabilities: text-to-speech, voice-cloning, multilingual-tts, real-time-streaming, emotion-control
Integrations: huggingface, local-inference
Use Cases: voice-cloning, audiobook-narration, virtual-assistants, game-characters, accessibility
API Available: Yes
Parameters: ~500M
Context Window: N/A
Modalities: text, audio
Training Cutoff: 2023
Tags: text-to-speech, voice-cloning, multilingual, open-source, coqui-tts
Added: 2026-03-17
Completeness: 100%

Index Score

Overall: 59.6
Adoption: 65
Quality: 83
Freshness: 70
Citations: 68
Engagement: 0
