Master Sentry AI Monitoring with our step-by-step tutorial, detailed feature walkthrough, and expert tips.
1. Sign up for Sentry and create a new project, selecting 'AI Monitoring' as your platform type.
2. Install the appropriate Sentry SDK (Python, JavaScript, etc.) and configure it with your AI framework (LangChain, OpenAI, etc.).
3. Add AI monitoring instrumentation to your agent code using Sentry's AI SDK extensions.
4. Deploy your instrumented AI application and verify that errors and performance data are appearing in the Sentry dashboard.
5. Configure AI-specific alerts for token limits, cost thresholds, and error rates based on your production requirements.
💡 Quick Start: Follow these 5 steps in order to get up and running with Sentry AI Monitoring quickly.
Explore the key features that make Sentry AI Monitoring powerful for analytics & monitoring workflows.
Every LLM call is captured as a span within Sentry's distributed trace system, showing the complete call chain from user action through model invocation to response, including tool calls and retrieval steps in agent workflows.
A multi-step RAG pipeline shows slow p95 latency. The trace view reveals that 70% of the latency comes from the vector database retrieval step, not the model inference, which directs optimization effort to the right place rather than leaving the team to guess.
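The kind of breakdown the trace view gives can be sketched in a few lines. This is only an illustration of the arithmetic, not Sentry's implementation: the `Span` class, the operation names, and the timings are hypothetical, and the steps are assumed to run one after another without overlapping.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """A simplified stand-in for one step in a trace (not Sentry's span type)."""
    op: str
    duration_ms: float


def latency_share(spans: list[Span]) -> dict[str, float]:
    """Return each operation's fraction of the summed span duration."""
    total = sum(s.duration_ms for s in spans)
    if total == 0:
        return {}
    shares: dict[str, float] = {}
    for s in spans:
        shares[s.op] = shares.get(s.op, 0.0) + s.duration_ms / total
    return shares


# Hypothetical timings for one slow RAG request.
trace = [
    Span("embed_query", 200.0),
    Span("vector_search", 2800.0),
    Span("llm_inference", 1000.0),
]

shares = latency_share(trace)
slowest = max(shares, key=shares.get)
print(slowest, round(shares[slowest], 2))  # vector_search 0.7
```

With these made-up numbers the retrieval step accounts for 70% of the request, which is the signal that would point optimization work at the vector database rather than the model.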
Automatic capture of input and output token counts for every model call, aggregated into usage trends by day, model, and endpoint. Cost estimates use current model pricing to translate token volume into dollar spend.
An engineering team catches a 5x token spike after a prompt template change is deployed to production. The cost analytics dashboard shows the anomaly within hours, preventing significant unbudgeted spend before the end of the billing cycle.
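As a rough sketch of how token volume translates into spend and how a spike might be flagged, consider the following. The model name, the prices, and the 3x threshold are assumptions chosen for illustration; they are not Sentry's pricing data or its anomaly logic.

```python
# Hypothetical prices in USD per million tokens; real prices vary by model
# and change over time.
PRICE_PER_MILLION = {
    "example-model": {"input": 2.50, "output": 10.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Translate token counts into an estimated dollar cost."""
    price = PRICE_PER_MILLION[model]
    return (
        input_tokens / 1_000_000 * price["input"]
        + output_tokens / 1_000_000 * price["output"]
    )


def is_token_spike(baseline_daily: list[int], today: int, factor: float = 3.0) -> bool:
    """Flag today's volume if it exceeds `factor` times the baseline mean."""
    if not baseline_daily:
        return False
    mean = sum(baseline_daily) / len(baseline_daily)
    return today > factor * mean


daily_tokens = [1_000_000, 1_100_000, 900_000]  # days before the deploy
after_deploy = 5_000_000                        # roughly a 5x jump

print(is_token_spike(daily_tokens, after_deploy))  # True
print(estimate_cost("example-model", 1_000_000, 200_000))
```

A simple mean-based threshold like this is easy to reason about but sensitive to noisy baselines; a longer window or a median would be a reasonable variation.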
When AI pipeline errors occur, they appear in Sentry's standard issue tracker alongside application errors, with full trace context including the prompt sent, model response received, and surrounding application state. Standard Sentry alerting and grouping apply.
Investigating user reports of broken AI responses: searching Sentry for the relevant user session surfaces the exact prompt that triggered the failure, the content_filter finish reason, and the 3 preceding application errors that may have contributed.
Official integrations for OpenAI (Python and JavaScript), Anthropic, LangChain, and Vercel AI SDK require a single initialization call—no manual logging, no custom wrapper functions, no changes to existing model call code.
An engineering team adds Sentry AI monitoring to an existing OpenAI-powered application in 15 minutes by adding two lines to their SDK initialization, immediately gaining full trace coverage across all existing model calls.
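For a Python application using the OpenAI client, the initialization might look like the sketch below. It assumes a recent `sentry-sdk` 2.x release with the `openai` package installed; the DSN is a placeholder, and the sampling and PII settings are illustrative choices, so check the SDK documentation for your version before relying on them.

```python
import sentry_sdk
from sentry_sdk.integrations.openai import OpenAIIntegration

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    traces_sample_rate=1.0,  # sample every trace; lower this in production
    send_default_pii=True,   # prompts and responses are only recorded with this on
    integrations=[OpenAIIntegration(include_prompts=True)],
)
```

Recording prompts means user-supplied text is sent to Sentry, so whether to enable `send_default_pii` and `include_prompts` is a privacy decision as much as a debugging one.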
Sentry's alerting rules apply to AI metrics—latency percentiles, error rates, token volume—with routing to PagerDuty, Slack, and OpsGenie. AI pipeline monitoring integrates into existing on-call workflows.
A p95 latency alert for the customer-facing AI assistant pages the on-call engineer when response times exceed 8 seconds, using the same PagerDuty routing as database and API availability alerts.
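The condition behind such an alert can be sketched as follows. This is a minimal illustration using a nearest-rank percentile over a list of request latencies; the function names and sample data are hypothetical, and Sentry's own percentile calculation and alert evaluation may differ.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the value at rank ceil(pct% of n) in sorted order."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def should_page(latencies_s: list[float], threshold_s: float = 8.0) -> bool:
    """Return True when the p95 latency exceeds the alert threshold."""
    return percentile(latencies_s, 95) > threshold_s


healthy = [2.0] * 19 + [12.0]        # one slow outlier in 20 requests
degraded = [2.0] * 18 + [12.0] * 2   # two slow requests in 20

print(should_page(healthy))   # False
print(should_page(degraded))  # True
```

Alerting on p95 rather than the maximum means a single slow request does not page anyone, while a sustained tail of slow responses does.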
Sentry AI adds specialized tracking for LLM errors, token usage, conversation context, and AI-specific performance metrics.
Yes, AI monitoring features integrate seamlessly with existing Sentry projects and workflows.
Sentry has native SDKs for Python and JavaScript, and supports LangChain, the OpenAI SDK, and custom integrations.
Sentry tracks LLM API costs through SDK instrumentation and provides dashboards and alerts for budget management.
Now that you know how to use Sentry AI Monitoring, it's time to put this knowledge into practice.
Follow our tutorial and master this powerful analytics & monitoring tool in minutes.
Tutorial updated March 2026