Master Datadog LLM Observability with our step-by-step tutorial, detailed feature walkthrough, and expert tips.
1. Enable LLM Observability. Navigate to the LLM Observability section in your Datadog dashboard and enable the feature for your organization. Ensure you have an active Datadog APM or Infrastructure subscription.
2. Install the Datadog Tracing SDK. Add the Datadog tracing library to your application (e.g., ddtrace for Python or dd-trace for Node.js) and configure it with your Datadog API key and site.
3. Instrument LLM Calls. Use auto-instrumentation for supported providers (OpenAI, Anthropic, Bedrock) or manually annotate LLM spans using the SDK. Auto-instrumentation detects LLM calls without code changes in most frameworks.
4. Configure Evaluations and Alerts. Enable built-in evaluations for quality, toxicity, and prompt injection detection. Set up Datadog monitors to alert on cost thresholds, error rates, or evaluation failures.
💡 Quick Start: Follow these steps in order to get up and running with Datadog LLM Observability quickly.
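The install and instrument steps above can be sketched as a small launcher. This is a minimal sketch, assuming the Python ddtrace package is installed and the application is started under ddtrace-run. The helper names llmobs_env and launch_command, the app name support-bot, and the script app.py are hypothetical. The DD_* names are the environment variables the Datadog Python SDK reads for LLM Observability as far as we know; verify them against the Datadog documentation for your SDK version.

```python
import os
import shlex


def llmobs_env(api_key: str, ml_app: str, site: str = "datadoghq.com",
               agentless: bool = True) -> dict:
    """Build the environment variables that switch on LLM Observability.

    ml_app is the application name that groups traces in the Datadog UI.
    agentless=True sends data straight to Datadog without a local Agent.
    """
    if not api_key:
        raise ValueError("a Datadog API key is required")
    env = {
        "DD_API_KEY": api_key,
        "DD_SITE": site,
        "DD_LLMOBS_ENABLED": "1",
        "DD_LLMOBS_ML_APP": ml_app,
    }
    if agentless:
        env["DD_LLMOBS_AGENTLESS_ENABLED"] = "1"
    return env


def launch_command(script: str) -> list:
    """Wrap the app in ddtrace-run so supported LLM clients are auto-instrumented."""
    return ["ddtrace-run", "python", script]


if __name__ == "__main__":
    # "support-bot" and "app.py" are placeholder names for illustration.
    env = llmobs_env(os.environ.get("DD_API_KEY", "<your-api-key>"), "support-bot")
    # Print everything except the key itself, so the secret never hits the terminal.
    shown = " ".join(f"{k}={shlex.quote(v)}" for k, v in sorted(env.items())
                     if k != "DD_API_KEY")
    print(shown, " ".join(launch_command("app.py")))
```

Keeping the settings in environment variables means the application code itself does not change, which matches the no-code-changes promise of auto-instrumentation. To point at the EU site, pass site="datadoghq.eu".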
Explore the key features that make Datadog LLM Observability powerful for analytics & monitoring workflows.
LangSmith and Langfuse are purpose-built LLM platforms focused on prompt engineering, dataset management, and developer-centric evaluation workflows. Datadog LLM Observability is built for production operations: it stitches LLM spans into the same distributed traces as your infrastructure, APM, and logs, and reuses Datadog's monitor, alerting, RBAC, and security detection systems. It is stronger for SRE and platform teams running AI in production, weaker for prompt iteration during development.
Datadog supports OpenAI, Anthropic, Amazon Bedrock, Azure OpenAI, Google Vertex AI, and other major providers, plus orchestration frameworks including LangChain, LlamaIndex, and OpenAI Assistants. Custom instrumentation is available through Datadog's SDKs for Python, Node.js, and other supported runtimes.
Datadog is a SaaS product and does not offer a self-hosted or on-prem version of LLM Observability. Teams with strict data residency requirements can choose between US, EU, and other regional Datadog sites, and can scrub sensitive data client-side before telemetry is shipped.
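Client-side scrubbing can be as simple as a redaction pass over prompt and response text before it is attached to a span. The sketch below is an illustration only: the scrub function and both patterns are hypothetical, they are not part of any Datadog SDK, and real deployments would need rules tuned to their own data.

```python
import re

# Hypothetical patterns for illustration; extend or replace for real data.
_PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("API_KEY", re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")),
]


def scrub(text: str) -> str:
    """Replace each match with a labelled placeholder, applying patterns in order."""
    for label, pattern in _PATTERNS:
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


if __name__ == "__main__":
    print(scrub("Contact jane@example.com with key sk-abcdefghijklmnop1234"))
```

Running the redaction in the application process means the raw values never leave your infrastructure, which is the point of doing it client-side rather than relying only on server-side scanning.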
Datadog offers built-in LLM-as-judge evaluations for quality, faithfulness, topic relevance, and toxicity, plus custom rule-based and code-based evaluators. Evaluations can run on sampled production traffic or on curated datasets, and results are stored alongside the trace so regressions are visible in the same UI as latency or cost spikes.
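A custom code-based evaluator run on sampled traffic might look like the sketch below. Everything here is an assumption made for illustration: EvalResult, evaluate_response, should_sample, the format_check label, and the banned phrase are hypothetical names, and the step that attaches a result to a trace through the Datadog SDK is deliberately left out.

```python
import random
from dataclasses import dataclass


@dataclass
class EvalResult:
    label: str    # name of the evaluation
    passed: bool  # outcome of the check
    reason: str   # short explanation for the trace view


def evaluate_response(response: str,
                      banned_phrases=("as an ai language model",),
                      max_chars: int = 2000) -> EvalResult:
    """Rule-based check: non-empty, within a length budget, no banned phrases."""
    text = response.strip()
    if not text:
        return EvalResult("format_check", False, "empty response")
    if len(text) > max_chars:
        return EvalResult("format_check", False, "response too long")
    lowered = text.lower()
    for phrase in banned_phrases:
        if phrase in lowered:
            return EvalResult("format_check", False, f"banned phrase: {phrase}")
    return EvalResult("format_check", True, "ok")


def should_sample(rate: float, rng=random) -> bool:
    """Return True for roughly `rate` of calls, e.g. 0.1 evaluates about 10% of traffic."""
    return rng.random() < rate


if __name__ == "__main__":
    if should_sample(1.0):
        print(evaluate_response("Paris is the capital of France."))
```

Sampling keeps evaluation cost bounded on high-volume endpoints, while returning a reason string alongside the pass/fail flag makes a failing trace easier to triage.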
LLM Observability integrates with Datadog's Sensitive Data Scanner and detection rules engine to flag prompt injection attempts, jailbreaks, and PII or secrets that appear in prompts or responses. Findings can be routed to Datadog Cloud SIEM workflows for security teams to triage.
Now that you know how to use Datadog LLM Observability, it's time to put this knowledge into practice.
Tutorial updated March 2026