
Tortoise TTS

by James Betker (neonbjb) · open-source · Last verified 2026-03-17

Tortoise TTS is an open-source, multi-voice text-to-speech system created by James Betker. It combines an autoregressive model with a diffusion decoder to produce highly expressive, natural-sounding speech and zero-shot voice cloning, at the cost of inference that is far slower than real-time TTS systems. Despite its slow generation speed, Tortoise is still regarded as a quality benchmark for open-source TTS and is widely used for offline audiobook and creative narration tasks.

https://github.com/neonbjb/tortoise-tts

C (Below Average)

Adoption: C+ · Quality: A · Freshness: B · Citations: F · Engagement: F

Specifications

License: Apache 2.0
Pricing: open-source
Capabilities: text-to-speech, zero-shot-voice-cloning, high-expressiveness, multi-voice, long-form-narration
Integrations: huggingface, local-inference
Use Cases: audiobook-narration, voice-cloning-research, creative-projects, offline-tts, character-voices
API Available: Yes
Parameters: ~200M
Context Window: N/A
Modalities: text, audio
Training Cutoff: 2023
Tags: text-to-speech, voice-cloning, open-source, high-quality, slow-inference
Added: 2026-03-17
Completeness: 87%
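
The long-form narration and local-inference entries above can be illustrated with a short sketch. It splits a long text into sentence-based chunks and synthesizes each chunk in turn. The helper names `split_into_chunks` and `narrate`, the 200-character chunk limit, and the output path are hypothetical and chosen for illustration. The Tortoise calls (`TextToSpeech`, `load_voice`, `tts_with_preset`), the bundled `tom` voice, and the 24 kHz output rate follow the usage shown in the project's README; they may differ between releases, so check them against the installed version.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Greedily pack whole sentences into chunks of at most max_chars.

    Hypothetical helper: a single sentence longer than max_chars is
    kept intact as its own chunk rather than being cut mid-sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def narrate(text, voice="tom", preset="fast", out_path="narration.wav"):
    """Synthesize text chunk by chunk and write one WAV file.

    Assumes tortoise-tts, torch, and torchaudio are installed. Imports
    are deferred so the chunking helper works without them.
    """
    import torch
    import torchaudio
    from tortoise.api import TextToSpeech
    from tortoise.utils.audio import load_voice

    tts = TextToSpeech()
    # Reference clips for the chosen voice drive the zero-shot cloning.
    voice_samples, conditioning_latents = load_voice(voice)

    clips = []
    for chunk in split_into_chunks(text):
        audio = tts.tts_with_preset(
            chunk,
            voice_samples=voice_samples,
            conditioning_latents=conditioning_latents,
            preset=preset,
        )
        clips.append(audio.squeeze(0).cpu())

    # Concatenate along the time axis; 24000 Hz is the rate used in the README.
    torchaudio.save(out_path, torch.cat(clips, dim=-1), 24000)


if __name__ == "__main__":
    print(split_into_chunks("One. Two! Three?", max_chars=9))
```

Chunking keeps each synthesis call short, which matters here because generation is much slower than real time. The faster presets trade some quality for speed.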

