Cerebras + LiteLLM
by LiteLLM · free · Last verified 2026-03-17
LiteLLM proxy integration for Cerebras Inference, exposing the throughput of Cerebras's wafer-scale chips through a unified OpenAI-compatible gateway. It lets developers route requests to Cerebras's CS-3 hardware (over 2,000 tokens/second on Llama 3.1 70B) from any existing OpenAI SDK integration via LiteLLM's model aliases.
https://docs.litellm.ai/docs/providers/cerebras
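A minimal sketch of routing a chat completion to Cerebras through LiteLLM, assuming `litellm` is installed (`pip install litellm`) and `CEREBRAS_API_KEY` is set. The model slug `llama3.1-70b` follows LiteLLM's documented `cerebras/<model>` naming convention, but check the provider docs for the slugs available to your account:

```python
import os

def cerebras_model(name: str) -> str:
    # LiteLLM routes a request to Cerebras when the model name
    # carries the "cerebras/" provider prefix.
    return f"cerebras/{name}"

# Only make a network call when a key is actually configured.
if os.environ.get("CEREBRAS_API_KEY"):
    import litellm

    response = litellm.completion(
        model=cerebras_model("llama3.1-70b"),
        messages=[{"role": "user", "content": "Say hello in five words."}],
    )
    print(response.choices[0].message.content)
```

Because LiteLLM mirrors the OpenAI SDK's request/response shapes, existing OpenAI client code usually needs only the model name changed to start using Cerebras.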
Overall grade: D (Poor) · Adoption: D · Quality: A · Freshness: A+ · Citations: D · Engagement: F
Specifications
- License: MIT
- Pricing: free
- Capabilities: model-routing, openai-compatible-proxy, wafer-scale-inference, high-throughput, fallback-routing
- Integrations: cerebras, litellm
- Use Cases: high-throughput-inference, model-gateway, fastest-llm-inference, batch-summarization
- API Available: Yes
- Tags: cerebras, litellm, wafer-scale, fast-inference, model-gateway
- Added: 2026-03-17
- Completeness: 100%
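The model-routing and fallback-routing capabilities above can be sketched with LiteLLM's `Router`. Everything here except the `cerebras/llama3.1-70b` model string is an illustrative assumption: the alias `fast-llama` and the fallback alias `backup-llama` are hypothetical names you would replace with your own deployments.

```python
import os

# Deployment list in LiteLLM Router format: each entry maps a public
# alias ("model_name") to concrete provider parameters.
MODEL_LIST = [
    {
        "model_name": "fast-llama",  # hypothetical alias
        "litellm_params": {
            "model": "cerebras/llama3.1-70b",
            "api_key": os.environ.get("CEREBRAS_API_KEY", ""),
        },
    },
]

# If requests to "fast-llama" fail, retry against the fallback alias
# ("backup-llama" is a placeholder for a second configured deployment).
FALLBACKS = [{"fast-llama": ["backup-llama"]}]

def build_router():
    """Construct the router; requires `pip install litellm`."""
    import litellm

    return litellm.Router(model_list=MODEL_LIST, fallbacks=FALLBACKS)
```

Clients then request `model="fast-llama"` through the proxy and never see which backend served them, which is what makes swapping or load-shedding providers transparent.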
Index Score: 37.8
- Adoption: 35
- Quality: 84
- Freshness: 92
- Citations: 28
- Engagement: 0