Kubernetes Model Serving
by AaaS · open-source · Last verified 2026-03-01
Deploys and manages LLM inference workloads on Kubernetes with GPU scheduling, auto-scaling based on queue depth, rolling updates, and canary deployments. Generates Helm charts and Kustomize configurations for reproducible deployments.
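To make the Helm/manifest-generation capability concrete, here is a minimal TypeScript sketch of the kind of Deployment object such a tool might emit for a GPU-backed model server. All names, images, and node-selector labels are illustrative assumptions, not the tool's actual output; only the `nvidia.com/gpu` extended-resource key and `apps/v1` Deployment shape are standard Kubernetes.

```typescript
// Hypothetical sketch: build a Deployment manifest that pins inference pods
// to GPU nodes and requests one GPU per replica. Field values are
// illustrative assumptions, not this tool's actual generated output.

interface ContainerSpec {
  name: string;
  image: string;
  resources: { limits: Record<string, string>; requests: Record<string, string> };
}

interface DeploymentManifest {
  apiVersion: string;
  kind: string;
  metadata: { name: string; labels: Record<string, string> };
  spec: {
    replicas: number;
    selector: { matchLabels: Record<string, string> };
    template: {
      metadata: { labels: Record<string, string> };
      spec: { containers: ContainerSpec[]; nodeSelector: Record<string, string> };
    };
  };
}

function buildModelDeployment(modelName: string, image: string, replicas: number): DeploymentManifest {
  const labels = { app: modelName, "serving/tier": "inference" };
  return {
    apiVersion: "apps/v1",
    kind: "Deployment",
    metadata: { name: `${modelName}-server`, labels },
    spec: {
      replicas,
      selector: { matchLabels: labels },
      template: {
        metadata: { labels },
        spec: {
          containers: [
            {
              name: "model-server",
              image,
              resources: {
                // nvidia.com/gpu is the standard extended resource exposed
                // by the NVIDIA device plugin; one GPU per replica here.
                limits: { "nvidia.com/gpu": "1", memory: "16Gi" },
                requests: { "nvidia.com/gpu": "1", memory: "16Gi" },
              },
            },
          ],
          // Illustrative node selector; the real label depends on the cluster.
          nodeSelector: { "example.com/accelerator": "gpu" },
        },
      },
    },
  };
}

const manifest = buildModelDeployment("llama-7b", "ghcr.io/example/llama-server:latest", 2);
console.log(manifest.metadata.name); // → llama-7b-server
console.log(manifest.spec.replicas); // → 2
```

A manifest object like this could be serialized to YAML (the listed js-yaml dependency suggests as much) or applied via @kubernetes/client-node.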
https://aaas.blog/script/kubernetes-model-serving
Overall Grade: C (Below Average)
Adoption: C · Quality: A · Freshness: A · Citations: C · Engagement: F
Specifications
- License: MIT
- Pricing: open-source
- Capabilities: gpu-scheduling, auto-scaling, rolling-updates, canary-deployment, helm-generation
- Integrations: @kubernetes/client-node, express, prom-client, winston
- Use Cases: production-ml-serving, multi-model-management, gpu-cluster-management, enterprise-inference
- API Available: No
- Language: typescript
- Dependencies: @kubernetes/client-node, express, prom-client, winston, js-yaml
- Environment: Node.js 20+ with kubectl and Kubernetes cluster access
- Est. Runtime: 3-10 minutes for deployment configuration
- Tags: script, automation, kubernetes, serving, orchestration
- Added: 2026-03-17
- Completeness: 100%
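The auto-scaling capability listed above keys on queue depth rather than CPU. With per-pod queue metrics exported via prom-client and surfaced through a custom-metrics adapter, that policy can be expressed as a standard `autoscaling/v2` HorizontalPodAutoscaler. A hypothetical sketch, assuming a metric named `inference_queue_depth` and a Deployment named `model-server` (both illustrative, not this tool's actual names):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: model-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: model-server        # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 8
  metrics:
    - type: Pods
      pods:
        metric:
          name: inference_queue_depth   # hypothetical custom metric
        target:
          type: AverageValue
          averageValue: "10"  # scale up beyond ~10 queued requests per pod
```

Pods-type metrics are averaged across replicas, so the controller adds replicas whenever the mean queue depth exceeds the target, which suits bursty inference traffic better than CPU-based scaling.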
Index Score: 46.1
- Adoption: 48
- Quality: 82
- Freshness: 84
- Citations: 42
- Engagement: 0