Complete pricing guide for Deepgram. Compare all plans, analyze costs, and find the perfect tier for your needs.
Not sure if free is enough? See our Free vs Paid comparison →
Still deciding? Read our full verdict on whether Deepgram is worth it →
mo
mo
mo
mo
Pricing sourced from Deepgram · Last verified March 2026
Deepgram's Nova model consistently posts the lowest word error rates in independent benchmarks, particularly on conversational audio with accents, crosstalk, or background noise. Real-world deployments report 15-30% relative WER reductions compared to Google Speech-to-Text and AWS Transcribe. Against AssemblyAI, Deepgram tends to win on streaming latency and pricing, while AssemblyAI is competitive on long-form batch accuracy. For multilingual conversational use, the new Flux model raises the bar further with built-in language detection across 10 languages.
Deepgram offers $200 in free credits on signup with no credit card required, which translates to roughly 750 hours of Nova streaming transcription. Pay-as-you-go STT pricing starts around $0.0043 per minute for pre-recorded Nova and $0.0077 per minute for streaming, with TTS billed per character. Growth and Enterprise tiers offer volume discounts, committed-use contracts, and custom model training. This pricing is typically 50-75% below Google Cloud Speech and AWS Transcribe at comparable quality levels.
End-to-end speech-to-text latency is typically 100-300ms over the WebSocket streaming API, with interim results returned even faster. The unified Voice Agent API further compresses round-trip time by collocating STT, LLM orchestration, and TTS — eliminating the network hops you'd see when stitching three separate vendors together. The new Flux model adds intelligent endpointing so the system reliably knows when a user has stopped speaking, which is critical for natural turn-taking in phone-quality conversations.
Yes — self-hosted deployment is one of Deepgram's key differentiators in the speech API category. Enterprise customers can run the same Nova and TTS models inside their own VPC, on-premises data centers, or air-gapped environments. This makes it viable for HIPAA-regulated medical transcription, financial services with data-residency rules, and government workloads. Most major cloud-only competitors do not offer a comparable self-hosted option.
Deepgram supports 30+ languages for transcription, with the new 2026 Flux model offering conversational STT in 10 languages including English, Spanish, German, French, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch with automatic language detection. Beyond raw transcription, the Audio Intelligence API adds summarization, sentiment analysis, topic detection, intent recognition, speaker diarization, and smart formatting. These can be applied to both batch files and live streams via flags on the same API call.
AI builders and operators use Deepgram to streamline their workflow.
Try Deepgram Now →