TGI (Text Generation Inference)
by Hugging Face · open-source · Last verified 2026-04-24
Hugging Face's Text Generation Inference (TGI) is a production-grade inference server for transformer models. It supports continuous batching, tensor parallelism, token streaming, and quantization, and is the backbone of Hugging Face's Inference Endpoints product. TGI exposes an HTTP API for text generation, including an OpenAI-compatible chat completions endpoint.
https://huggingface.co/docs/text-generation-inference
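As a quick illustration of the HTTP API mentioned above, the sketch below builds a request for TGI's `/generate` endpoint using only the standard library. The base URL and port are assumptions (a self-hosted server on `localhost:8080`); the payload shape (`inputs` plus a `parameters` object) follows TGI's documented generate schema.

```python
# Minimal sketch of a client request to a self-hosted TGI server's
# /generate endpoint. The base URL is an assumption; adjust it to
# wherever your TGI container is listening.
import json
import urllib.request


def build_generate_request(prompt: str,
                           max_new_tokens: int = 64,
                           base_url: str = "http://localhost:8080"):
    """Build an HTTP request for TGI's /generate endpoint."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_generate_request("What is continuous batching?")
    print(req.get_full_url())
    print(json.loads(req.data))
    # Against a running server you would then do:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["generated_text"])
```

The live call is left commented out since it requires a running TGI instance; the request-building step alone shows the expected endpoint and JSON shape.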
Index grade: C (Below Average)
Adoption: C+ · Quality: B+ · Freshness: A · Citations: C · Engagement: F
Specifications
- License: Open Source
- Pricing: open-source
- Capabilities
- Integrations
- Use Cases
- API Available: No
- SDK Languages
- Tags: inference, hugging-face, open-source, continuous-batching, production, transformer
- Added: 2026-04-24
- Completeness: 60%
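For reference, a typical way to run a self-hosted TGI server is via the official container image. This is a sketch, not a definitive deployment: the image tag, port mapping, and model ID below are illustrative and should be adapted to your hardware and model of choice.

```shell
# Launch TGI in Docker (assumes a GPU host with the NVIDIA container
# toolkit installed). The model ID and tag here are examples only.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v "$PWD/data:/data" \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id HuggingFaceH4/zephyr-7b-beta
```

The `-v` mount caches downloaded model weights between runs, and `-p 8080:80` exposes the server's HTTP API on the host.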
Index Score: 44
- Adoption: 50
- Quality: 70
- Freshness: 80
- Citations: 40
- Engagement: 0