Cloudflare Launches AI Gateway — Route, Cache, and Monitor LLM Calls
Route, cache, and monitor your LLM API calls through Cloudflare AI Gateway. Putting a gateway in front of your providers improves reliability with automatic failover, can cut API costs by up to 90% through caching, and adds the observability that production AI applications need.
5 Steps
- 1
Create Your AI Gateway Instance: Log in to your Cloudflare Dashboard, navigate to the 'AI Gateway' section, and create a new gateway. Note down the unique Gateway URL provided, as this will be your new LLM endpoint.
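Once created, the Gateway URL follows a predictable shape: account ID, then gateway name, then the provider you are routing to. A minimal sketch (`ACCOUNT_ID` and `my-gateway` are placeholders for your own values):

```python
def gateway_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build the provider-specific endpoint behind a Cloudflare AI Gateway."""
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}"

print(gateway_url("ACCOUNT_ID", "my-gateway", "openai"))
# → https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/my-gateway/openai
```

Everything after the provider segment mirrors the provider's own API paths, so existing request code mostly carries over unchanged.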
- 2
Configure LLM Providers and Routing: Within your new AI Gateway settings, add the API keys for your desired LLM providers (e.g., OpenAI, Anthropic). Define routing rules to specify primary and failover providers for automatic resilience.
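Failover can also be expressed per request via the gateway's Universal Endpoint, which accepts an ordered list of provider steps and tries each until one succeeds. The sketch below builds such a payload; the field names (`provider`, `endpoint`, `headers`, `query`) follow Cloudflare's documented shape, but verify them and the model names against the current docs before relying on them:

```python
import json

def fallback_payload(prompt: str, openai_key: str, anthropic_key: str) -> str:
    """Request body for the Universal Endpoint: OpenAI first, Anthropic as failover.
    Field names and endpoint paths are assumptions based on Cloudflare's docs."""
    steps = [
        {
            "provider": "openai",
            "endpoint": "chat/completions",
            "headers": {"Authorization": f"Bearer {openai_key}",
                        "Content-Type": "application/json"},
            "query": {"model": "gpt-4o-mini",
                      "messages": [{"role": "user", "content": prompt}]},
        },
        {
            "provider": "anthropic",
            "endpoint": "v1/messages",
            "headers": {"x-api-key": anthropic_key,
                        "anthropic-version": "2023-06-01",
                        "Content-Type": "application/json"},
            "query": {"model": "claude-3-5-haiku-latest", "max_tokens": 256,
                      "messages": [{"role": "user", "content": prompt}]},
        },
    ]
    return json.dumps(steps)
```

POSTing this body to the bare gateway URL (no provider suffix) lets the gateway handle the retry logic for you instead of your application code.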
- 3
Enable Caching for Cost Savings: Activate caching within the AI Gateway settings. Set appropriate cache expiration policies for routes or globally to reduce redundant LLM calls and significantly cut down on API costs.
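Beyond the dashboard defaults, cache behavior can be tuned per request with gateway headers. A small sketch, assuming the `cf-aig-*` header names from Cloudflare's AI Gateway docs (confirm against the current reference):

```python
def cached_request_headers(ttl_seconds: int, skip_cache: bool = False) -> dict:
    """Per-request cache overrides sent alongside the normal provider headers.
    Header names are assumptions taken from Cloudflare's cf-aig-* conventions."""
    headers = {"cf-aig-cache-ttl": str(ttl_seconds)}  # cache this response for N seconds
    if skip_cache:
        headers["cf-aig-skip-cache"] = "true"  # bypass the cache for this one call
    return headers
```

Identical prompts within the TTL are then served from cache without hitting the upstream provider at all, which is where the cost savings come from.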
- 4
Implement Rate Limiting (Optional): Configure rate limits per API key, route, or user. This helps prevent abuse, control spending, and protect your upstream LLM providers from being overwhelmed.
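When a limit is hit, the gateway rejects the call (typically with an HTTP 429, the standard rate-limit status), so client code should back off rather than retry immediately. A minimal jittered-backoff sketch:

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: wait a random time between 0 and
    min(cap, base * 2^attempt) seconds before retrying a rate-limited request."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Call this in your retry loop whenever a 429 comes back; the randomness spreads retries out so clients don't all hammer the gateway in lockstep.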
- 5
Update Application to Use Gateway: Modify your application's code to direct all LLM API calls to your Cloudflare AI Gateway URL. Ensure your application continues to pass the original LLM provider's API key in the `Authorization` header for authentication.
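In practice this is usually a one-line change: swap the provider's base URL for the gateway URL and leave the rest of the request untouched. A stdlib-only sketch for OpenAI-style chat completions (`ACCOUNT_ID` and `my-gateway` are placeholders; SDK users can instead point their client's base URL at the same address):

```python
import json
import urllib.request

# Placeholder gateway endpoint from Step 1; substitute your own values.
GATEWAY = "https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/my-gateway/openai"

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """The same request the app already sends to OpenAI, with only the host
    swapped; the provider key still travels in the Authorization header."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{GATEWAY}/chat/completions",
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# To actually send it:
#   resp = urllib.request.urlopen(build_chat_request(key, "gpt-4o-mini", "Hello"))
```

Because the payload and auth header are unchanged, rolling back is equally trivial: point the base URL at the provider again.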