NVIDIA Dynamo
by NVIDIA · open-source · Last verified 2026-04-24
NVIDIA Dynamo is a distributed inference framework for disaggregated prefill and decode serving of large language models across GPU clusters. It scales inference beyond a single node by distributing KV-cache and attention computation across nodes, with the goal of maximizing throughput for very large models in multi-node deployments.
https://github.com/ai-dynamo/dynamo
Grade: C (Below Average)
Adoption: C+ · Quality: B+ · Freshness: A · Citations: C · Engagement: F
Specifications
- License: Open Source
- Pricing: open-source
- Capabilities: (none listed)
- Integrations: (none listed)
- Use Cases: (none listed)
- API Available: No
- SDK Languages: (none listed)
- Tags: inference, nvidia, distributed, disaggregated, kv-cache, multi-node, scale
- Added: 2026-04-24
- Completeness: 60%
Index Score: 44
- Adoption: 50
- Quality: 70
- Freshness: 80
- Citations: 40
- Engagement: 0