Stay free if you only need core proxying for 20+ AI providers, which is available on all Cloudflare plans including the free tier. Upgrade if you need higher request and log volume limits, or Workers Logpush for log export. Most solo builders can start free.
Why it matters — limitations to weigh:
- Adds an additional infrastructure dependency and proxy hop to every AI request.
- Lacks the deep prompt versioning, evaluation, and dataset tooling of dedicated LLMOps platforms like LangSmith or Langfuse.
- Many advanced features (Dynamic Routing, DLP, Guardrails, WebSockets, BYOK) are still in beta and may change.
- Best value is realized only if you are already in, or willing to adopt, the Cloudflare ecosystem.
- Configuring dynamic routing JSON and fallback policies has a learning curve for sophisticated multi-provider setups.
AI Gateway adds minimal overhead — typically under 10ms — because it runs on Cloudflare's global edge network spanning 300+ cities. For cached responses, latency improves dramatically with sub-10ms response times served directly from the edge instead of the origin provider. The proxy is geographically close to both your application and the target AI provider, which often makes the round-trip faster than calling the provider directly. In practice, most users see net latency improvements once caching is enabled.
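Caching is opt-in per request. A minimal sketch of how a client might request edge caching via a gateway header — the header name cf-aig-cache-ttl is taken from Cloudflare's docs, but verify it against the current reference; no real request is made here:

```python
# Sketch: asking AI Gateway to cache a response at the edge for a TTL.
# Assumption: the gateway reads the cf-aig-cache-ttl request header
# (value in seconds). Identical prompts within the TTL are then served
# from the edge instead of the origin provider.

def cache_headers(ttl_seconds: int) -> dict:
    """Extra request headers that opt this call into edge caching."""
    return {"cf-aig-cache-ttl": str(ttl_seconds)}

# With an OpenAI-style SDK you would pass these alongside a normal call,
# e.g. extra_headers=cache_headers(3600) for a one-hour cache window.
print(cache_headers(3600))
```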
Yes — integration takes one line of code. You only change your API endpoint URL from the provider's direct endpoint (e.g., api.openai.com) to your AI Gateway endpoint; all existing authentication, request formatting, and response handling remain unchanged. AI Gateway also offers a Unified API with OpenAI-compatible request schemas, so you can switch providers without rewriting client code. The official OpenAI and Anthropic SDKs and the Vercel AI SDK all work transparently. Adoption is intentionally frictionless for existing applications.
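The one-line switch can be sketched as a URL swap. The gateway.ai.cloudflare.com path shape below follows Cloudflare's documented pattern, but treat it as an assumption to check against the docs; ACCOUNT_ID and GATEWAY_NAME are placeholders from your dashboard:

```python
# Sketch: the only change is the base URL your SDK points at.
# Auth headers, request body, and response parsing stay identical.

def gateway_base_url(account_id: str, gateway: str, provider: str = "openai") -> str:
    """Build the AI Gateway endpoint that replaces the provider's direct URL."""
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway}/{provider}"

# Before: OpenAI(base_url="https://api.openai.com/v1")
# After:  OpenAI(base_url=gateway_base_url("ACCOUNT_ID", "my-gateway"))
print(gateway_base_url("ACCOUNT_ID", "my-gateway"))
```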
AI Gateway is available on all Cloudflare plans, including the free tier, with no credit card required to start. Core features like analytics, logging, caching, and rate limiting are accessible at the free level, while advanced features and higher request volumes scale through Cloudflare's standard usage-based pricing. Workers AI inference, Logpush, and persistent log storage may incur additional charges depending on volume. For exact rates, check the Pricing page in the AI Gateway docs, since limits and pricing tiers are tied to your overall Cloudflare account plan.
AI Gateway supports 20+ providers natively including OpenAI, Anthropic, Google AI Studio, Google Vertex AI, Amazon Bedrock, Azure OpenAI, Workers AI, Cohere, DeepSeek, Mistral AI, Groq, Perplexity, Replicate, ElevenLabs, HuggingFace, OpenRouter, xAI, Cerebras, Baseten, Cartesia, Deepgram, Fal AI, Ideogram, and Parallel. There is also a Custom Providers beta for adding any HTTP-accessible model. The Unified API lets you call all of these with a single OpenAI-compatible schema, which makes multi-provider A/B testing and fallback trivial.
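A sketch of what "single OpenAI-compatible schema" means in practice: only the model string selects the provider, so A/B tests are a string swap. The "{provider}/{model}" naming convention and compat-style endpoint are assumptions modeled on Cloudflare's Unified API description — confirm both against the current docs:

```python
# Sketch: one OpenAI-shaped request body, many providers.
# Assumption: the Unified API routes on a "provider/model" model string.
# ACCOUNT_ID and GATEWAY_NAME are placeholders.

UNIFIED_BASE = "https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/GATEWAY_NAME/compat"

def chat_payload(provider: str, model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat payload targeting a given provider."""
    return {
        "model": f"{provider}/{model}",
        "messages": [{"role": "user", "content": prompt}],
    }

# The same payload shape works for either provider — only "model" changes:
print(chat_payload("openai", "gpt-4o-mini", "hi")["model"])
print(chat_payload("anthropic", "claude-3-5-haiku", "hi")["model"])
```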
AI Gateway is primarily a proxy and traffic-control layer that runs on Cloudflare's edge — its strengths are caching, rate limiting, fallback, and infrastructure-level observability. Helicone is a closer feature match (proxy + analytics) but lacks deep Cloudflare-stack integration. LangSmith and Langfuse are LLMOps platforms focused on prompt engineering, evaluations, traces, and datasets — they offer richer developer-loop tooling but typically pair with, rather than replace, an edge proxy. Choose AI Gateway when you need production-grade traffic management on Cloudflare; choose Langfuse/LangSmith when prompt iteration and evaluation are the priority.
Start with the free plan — upgrade when you need more.
Get Started Free →
Still not sure? Read our full verdict →
Last verified March 2026