Complete pricing guide for Cloudflare AI Gateway. Compare all plans, analyze costs, and find the perfect tier for your needs.
Not sure if free is enough? See our Free vs Paid comparison →
Still deciding? Read our full verdict on whether Cloudflare AI Gateway is worth it →
Pricing sourced from Cloudflare AI Gateway · Last verified March 2026
AI Gateway adds minimal overhead, typically under 10ms, because it runs on Cloudflare's global edge network spanning 300+ cities. Cached responses are served directly from the edge in single-digit milliseconds, skipping the origin provider entirely. Because the proxy is geographically close to both your application and the target AI provider, the round trip is often faster than calling the provider directly; in practice, most users see a net latency improvement once caching is enabled.
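As a sketch of how caching is controlled per request: AI Gateway reads `cf-aig-*` request headers to set or skip the cache. The helper below builds such headers; the exact header names and accepted values are assumptions here, so verify them against the current AI Gateway docs before relying on them.

```python
def cache_headers(ttl_seconds: int, skip_cache: bool = False) -> dict:
    """Build per-request AI Gateway caching headers (names assumed
    from the cf-aig-* convention; confirm in the official docs)."""
    headers = {"cf-aig-cache-ttl": str(ttl_seconds)}
    if skip_cache:
        # Bypass the edge cache for this one request.
        headers["cf-aig-skip-cache"] = "true"
    return headers

# Attach these to your usual HTTP request to the gateway endpoint:
print(cache_headers(3600))  # cache this response at the edge for 1 hour
```

Setting a TTL per request keeps cache policy in application code, so different prompts (e.g., static system prompts vs. user-specific queries) can get different caching behavior.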
Yes — integration takes one line of code. You only change your API endpoint URL from the provider's direct endpoint (e.g., api.openai.com) to your AI Gateway endpoint, and all existing authentication, request formatting, and response handling remain unchanged. AI Gateway also offers a Unified API with OpenAI-compatible request schemas, so you can switch providers without rewriting client code. SDKs from OpenAI, Anthropic, and the Vercel AI SDK all work transparently. Adoption is intentionally frictionless for existing applications.
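To illustrate the one-line change: the only thing that moves is the base URL your SDK points at. The URL shape below (account ID, gateway name, provider slug) follows the pattern shown in the AI Gateway dashboard, but treat the placeholders as hypothetical and copy the exact endpoint from your own dashboard.

```python
def gateway_base_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build an AI Gateway base URL for a given provider slug
    (e.g. "openai", "anthropic"). Placeholder values are illustrative."""
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}"

# With the OpenAI SDK, this is the entire migration:
#   client = OpenAI(api_key="sk-...",  # your existing key, unchanged
#                   base_url=gateway_base_url("your-account-id", "my-gateway", "openai"))
# All subsequent client.chat.completions.create(...) calls work as before.
print(gateway_base_url("your-account-id", "my-gateway", "openai"))
```

Authentication headers, request bodies, and response parsing are untouched; the gateway forwards everything to the provider after applying caching, rate limits, and logging.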
AI Gateway is available on all Cloudflare plans, including the free tier, with no credit card required to start. Core features like analytics, logging, caching, and rate limiting are accessible at the free level, while advanced features and higher request volumes scale through Cloudflare's standard usage-based pricing. Workers AI inference, Logpush, and persistent log storage may incur additional charges depending on volume. For exact rates, check the Pricing page in the AI Gateway docs, since limits and pricing tiers are tied to your overall Cloudflare account plan.
AI Gateway supports 20+ providers natively including OpenAI, Anthropic, Google AI Studio, Google Vertex AI, Amazon Bedrock, Azure OpenAI, Workers AI, Cohere, DeepSeek, Mistral AI, Groq, Perplexity, Replicate, ElevenLabs, HuggingFace, OpenRouter, xAI, Cerebras, Baseten, Cartesia, Deepgram, Fal AI, Ideogram, and Parallel. There is also a Custom Providers beta for adding any HTTP-accessible model. The Unified API lets you call all of these with a single OpenAI-compatible schema, which makes multi-provider A/B testing and fallback trivial.
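The practical payoff of the Unified API is that switching providers only changes the model string, not the request schema. The sketch below assumes the "provider/model" naming style for the unified endpoint; the specific model identifiers are illustrative, so check the Unified API section of the docs for the exact format.

```python
import json

def build_request(model: str, prompt: str) -> str:
    """Build one OpenAI-compatible chat payload. For A/B tests or
    fallback, only the model string changes between providers."""
    return json.dumps({
        "model": model,  # e.g. "openai/gpt-4o-mini" (illustrative)
        "messages": [{"role": "user", "content": prompt}],
    })

# Same schema, two providers -- swap on error for a simple fallback:
primary = build_request("openai/gpt-4o-mini", "Summarize this ticket.")
fallback = build_request("anthropic/claude-haiku", "Summarize this ticket.")
```

Because both payloads are structurally identical, client code for parsing responses, retries, and streaming can stay provider-agnostic.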
AI Gateway is primarily a proxy and traffic-control layer that runs on Cloudflare's edge — its strengths are caching, rate limiting, fallback, and infrastructure-level observability. Helicone is a closer feature match (proxy + analytics) but lacks deep Cloudflare-stack integration. LangSmith and Langfuse are LLMOps platforms focused on prompt engineering, evaluations, traces, and datasets — they offer richer developer-loop tooling but typically pair with, rather than replace, an edge proxy. Choose AI Gateway when you need production-grade traffic management on Cloudflare; choose Langfuse/LangSmith when prompt iteration and evaluation are the priority.
AI builders and operators use Cloudflare AI Gateway to streamline their workflow.
Try Cloudflare AI Gateway Now →

Helicone: Open-source LLM observability platform and API gateway that provides cost analytics, request logging, caching, and rate limiting through a simple proxy-based integration requiring only a base URL change. Compare Pricing →

LangSmith: Trace, analyze, and evaluate LLM applications and agents with deep observability into every model call, chain step, and tool invocation. Compare Pricing →

Langfuse: Leading open-source LLM observability platform for production AI applications. Comprehensive tracing, prompt management, evaluation frameworks, and cost optimization with enterprise security (SOC2, ISO27001, HIPAA). Self-hostable with full feature parity. Compare Pricing →