A natural language processing (NLP) service that uses machine learning to find insights and relationships in text, including sentiment analysis, entity recognition, key phrase extraction, language detection, and PII redaction.
Amazon Comprehend is a fully managed natural language processing (NLP) service from Amazon Web Services that uses machine learning to uncover insights and relationships in unstructured text. It is designed to help organizations process large volumes of documents, customer support tickets, product reviews, emails, and social media feeds without requiring in-house machine learning expertise. By abstracting away model training, infrastructure provisioning, and scaling, Comprehend allows developers and data teams to integrate advanced text analytics into applications through a simple API, the AWS SDKs, or direct integrations with other AWS services such as S3, Lambda, Kinesis, and Amazon OpenSearch Service.
The service provides a broad catalog of prebuilt NLP capabilities out of the box. These include sentiment analysis that classifies text as positive, negative, neutral, or mixed; entity recognition that identifies people, places, organizations, dates, quantities, events, and other entities; key phrase extraction that surfaces the most important noun phrases in a document; language detection across more than a hundred languages; syntax analysis for part-of-speech tagging; topic modeling across large document collections; and targeted sentiment analysis that associates sentiment with specific entities mentioned in the same text. Comprehend also ships with personally identifiable information (PII) detection and redaction, which is widely used to scrub sensitive data such as names, addresses, phone numbers, credit card numbers, and identifiers from text before it is stored, indexed, or shared downstream.
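As a rough illustration of how several of these prebuilt detectors might be called together, here is a minimal Python sketch using the boto3 Comprehend client. The helper names `make_client` and `analyze_text` are hypothetical, the default region is an assumption, and only the main response fields are kept.

```python
from typing import Any, Dict


def make_client(region_name: str = "us-east-1") -> Any:
    """Create a Comprehend client (hypothetical helper; region is an assumption).

    boto3 is imported lazily so analyze_text can be tried with a stub client.
    """
    import boto3  # needs the boto3 package and configured AWS credentials

    return boto3.client("comprehend", region_name=region_name)


def analyze_text(client: Any, text: str, language_code: str = "en") -> Dict[str, Any]:
    """Run three prebuilt detectors on one document and keep the main fields."""
    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)
    entities = client.detect_entities(Text=text, LanguageCode=language_code)
    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    return {
        # One of POSITIVE, NEGATIVE, NEUTRAL, or MIXED
        "sentiment": sentiment["Sentiment"],
        # (surface text, entity type) pairs such as ("Seattle", "LOCATION")
        "entities": [(e["Text"], e["Type"]) for e in entities["Entities"]],
        "key_phrases": [p["Text"] for p in phrases["KeyPhrases"]],
    }
```

With credentials in place, a call would look like `analyze_text(make_client(), "The staff in Seattle were wonderful.")`. Each detector is a separate billable request, so calling all three triples the per-document cost.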
Beyond the general-purpose APIs, Amazon Comprehend offers custom classification and custom entity recognition, allowing teams to train domain-specific models on their own labeled data without writing ML code. Amazon Comprehend Medical is a specialized variant for healthcare and life sciences, extracting medical entities, medications, dosages, medical conditions, protected health information (PHI), and ICD-10-CM and RxNorm ontology links from clinical notes, discharge summaries, and trial records. The service is HIPAA eligible and integrates with other compliance-oriented AWS services, making it attractive for regulated industries.
Comprehend supports both real-time synchronous inference for single documents and short batches and asynchronous batch jobs that can process millions of documents stored in S3. Pricing is usage-based and billed per unit of 100 characters, with a 12-month Free Tier that includes a monthly allowance for most operations, which lowers the cost of experimentation. Because it is a native AWS service, it inherits IAM-based access control, VPC endpoint support, CloudWatch monitoring, and CloudTrail auditing, which makes it straightforward to adopt in enterprises already standardized on AWS. Its main trade-offs are tighter coupling to the AWS ecosystem, per-character costs that can add up at very high volumes, and less flexibility than open-source frameworks such as spaCy or Hugging Face for teams that want to fully control model architecture and weights.
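To make the per-unit billing concrete, the arithmetic can be sketched as below. This is an estimate under stated assumptions, not AWS's billing logic: the function names are hypothetical, the 3-unit minimum per request is an assumption to verify against the current pricing page, and the price per unit is left as a parameter because it varies by API and volume tier.

```python
import math
from typing import Iterable


def billable_units(char_count: int, unit_size: int = 100, min_units: int = 3) -> int:
    """Round a document's length up to whole 100-character units.

    min_units is an assumed per-request minimum; confirm it on the pricing page.
    """
    return max(min_units, math.ceil(char_count / unit_size))


def estimate_cost(doc_lengths: Iterable[int], price_per_unit: float) -> float:
    """Sum billable units over all documents and multiply by a caller-supplied price."""
    return sum(billable_units(n) for n in doc_lengths) * price_per_unit
```

For example, a 1,001-character review rounds up to 11 units, while a 50-character tweet is charged at the assumed minimum of 3 units, which is why very short documents cost more per character than long ones.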
Teams can train custom text classification models and custom entity recognition models by uploading labeled training data in CSV or augmented manifest format. Comprehend handles all ML pipeline steps automatically — feature engineering, training, hyperparameter tuning, and evaluation — producing precision, recall, and F1 metrics for each trained model. Custom classifiers support multi-class and multi-label modes, while custom entity recognizers can identify domain-specific entities not covered by the pre-trained models. Trained models can be deployed to real-time inference endpoints or used for asynchronous batch processing, and up to 5 training jobs per month are included in the free tier.
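As a sketch of what preparing CSV training data might look like, the helper below writes one document per row with the label in the first column and the text in the second. The two-column, no-header layout and the `|` separator for multi-label rows reflect the CSV format as commonly documented for custom classifiers, but treat both as assumptions to check against the current developer guide. The function name `build_training_csv` is hypothetical.

```python
import csv
import io
from typing import Iterable, List, Tuple


def build_training_csv(
    rows: Iterable[Tuple[List[str], str]], label_delimiter: str = "|"
) -> str:
    """Serialize (labels, text) pairs as label-first CSV rows without a header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for labels, text in rows:
        if not labels:
            raise ValueError("each document needs at least one label")
        # Collapse newlines and repeated spaces so each document stays on one row
        clean = " ".join(text.split())
        # Multi-class rows carry one label; multi-label rows join several labels
        writer.writerow([label_delimiter.join(labels), clean])
    return buf.getvalue()
```

The resulting string would be uploaded to S3 and referenced when creating the classifier. Using the `csv` module rather than string concatenation ensures that commas and quotes inside the document text are escaped correctly.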
PII detection identifies over 30 types of personally identifiable information, including names, addresses, Social Security numbers, credit card numbers, phone numbers, email addresses, dates of birth, bank account numbers, and driver's license numbers. It supports both a detection mode (returns entity types with confidence scores and character offsets) and a redaction mode (returns text with PII replaced by entity type labels or redaction markers). This supports GDPR, CCPA, and HIPAA compliance workflows without building custom regex patterns or integrating third-party data masking tools. PII detection is available via both synchronous and asynchronous APIs.
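Detection mode returns offsets rather than redacted text, so a caller who wants redaction from the synchronous API has to apply the offsets. A minimal sketch follows, assuming `entities` is the `Entities` list returned by the boto3 call `detect_pii_entities(Text=..., LanguageCode="en")`. The function name `redact_pii` and the 0.5 score threshold are illustrative choices, not service defaults.

```python
from typing import Any, Dict, List


def redact_pii(text: str, entities: List[Dict[str, Any]], min_score: float = 0.5) -> str:
    """Replace each detected PII span with its entity type in square brackets."""
    kept = [e for e in entities if e["Score"] >= min_score]
    # Work from the last span to the first so earlier offsets stay valid
    for e in sorted(kept, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[: e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"] :]
    return text
```

Lowering `min_score` trades more false positives for fewer missed identifiers, which is usually the safer direction when the output will be stored or shared.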
Amazon Comprehend Medical is a specialized variant that extracts medical entities such as conditions, medications, dosages, procedures, test results, and anatomical terms from clinical text. It links extracted entities to standard medical ontologies including ICD-10-CM (diagnoses), RxNorm (medications), and SNOMED CT (medical concepts), enabling structured data extraction from unstructured clinical notes, discharge summaries, and pathology reports. The service is HIPAA-eligible when used under an AWS Business Associate Agreement, making it one of the few managed NLP services eligible for processing protected health information in production healthcare environments.
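As an illustration of turning a Comprehend Medical response into structured data, the sketch below groups detected entities by category. It assumes `response` is the dictionary returned by the boto3 `comprehendmedical` client's `detect_entities_v2(Text=...)` call. The function name and the 0.5 score threshold are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_medical_entities(
    response: Dict[str, Any], min_score: float = 0.5
) -> Dict[str, List[str]]:
    """Group entity text by category (for example MEDICATION or MEDICAL_CONDITION)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in response.get("Entities", []):
        # Skip low-confidence detections; the threshold is an arbitrary example
        if entity["Score"] >= min_score:
            grouped[entity["Category"]].append(entity["Text"])
    return dict(grouped)
```

Ontology linking uses separate operations on the same client (`infer_icd10_cm`, `infer_rx_norm`, and `infer_snomedct`), each billed as its own request.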
Targeted sentiment goes beyond document-level sentiment to identify sentiment expressed toward specific entities mentioned in the text. For example, in a product review mentioning both battery life and screen quality, targeted sentiment can separately classify the sentiment toward each attribute (positive for screen quality and negative for battery life) rather than returning a single mixed sentiment score for the entire review. This entity-level granularity is valuable for product feedback analysis, brand monitoring, and competitive intelligence, where understanding sentiment toward specific features or aspects is more actionable than overall document sentiment.
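The battery-and-screen example can be sketched in Python as follows. The helper assumes `response` has the shape returned by the boto3 call `detect_targeted_sentiment(Text=..., LanguageCode="en")`, where each entity holds a list of mentions with a `MentionSentiment` field. The function name `sentiment_by_mention` is hypothetical.

```python
from typing import Any, Dict


def sentiment_by_mention(response: Dict[str, Any]) -> Dict[str, str]:
    """Map each mentioned phrase (lowercased) to its sentiment label."""
    result: Dict[str, str] = {}
    for entity in response.get("Entities", []):
        for mention in entity.get("Mentions", []):
            # If the same phrase appears twice, the later mention overwrites the earlier one
            result[mention["Text"].lower()] = mention["MentionSentiment"]["Sentiment"]
    return result
```

A production version would likely aggregate repeated mentions instead of overwriting them, for instance by counting labels per phrase across many reviews.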
Batch processing handles large collections of documents stored in Amazon S3 via asynchronous jobs, supporting up to 5 GB of input data per job with individual documents up to 100 KB. Results are written back to S3 in JSON format. All standard NLP APIs are available in batch mode, enabling cost-effective processing of millions of documents without managing infrastructure. Batch jobs scale automatically and are well suited to periodic analysis of large document repositories, ETL pipelines, and data lake enrichment workflows where real-time latency is not required.
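A batch workflow of this kind might be sketched as below for sentiment detection, using the boto3 operations `start_sentiment_detection_job` and `describe_sentiment_detection_job`. The helper names, the 30-second polling interval, and the one-document-per-line input format are assumptions chosen for illustration, and the S3 URIs and IAM role ARN must be supplied by the caller.

```python
import time
from typing import Any, Callable

# Job states after which polling can stop
TERMINAL_STATES = {"COMPLETED", "FAILED", "STOPPED"}


def start_sentiment_job(
    client: Any, input_s3_uri: str, output_s3_uri: str, role_arn: str,
    language_code: str = "en",
) -> str:
    """Submit an asynchronous sentiment job over documents in S3 and return its ID."""
    response = client.start_sentiment_detection_job(
        # ONE_DOC_PER_LINE treats every line of each input file as a document
        InputDataConfig={"S3Uri": input_s3_uri, "InputFormat": "ONE_DOC_PER_LINE"},
        OutputDataConfig={"S3Uri": output_s3_uri},
        DataAccessRoleArn=role_arn,
        LanguageCode=language_code,
    )
    return response["JobId"]


def wait_for_job(
    client: Any, job_id: str, poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal state and return that state."""
    while True:
        props = client.describe_sentiment_detection_job(JobId=job_id)[
            "SentimentDetectionJobProperties"
        ]
        if props["JobStatus"] in TERMINAL_STATES:
            return props["JobStatus"]
        sleep(poll_seconds)
```

In an ETL pipeline, polling is often replaced by a scheduled check or an event-driven trigger on the output location, since a long-running loop ties up a worker for the duration of the job.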
Free for 12 months
Per 100-character unit, tiered by volume
Training + inference fees
Per-job pricing by document volume
Higher per-unit pricing than core APIs
As of 2026, Amazon Comprehend sits within an AWS AI portfolio that has increasingly shifted toward generative AI via Amazon Bedrock, Titan, and integrations with foundation model providers. Comprehend's role has evolved into the specialized, deterministic NLP layer for structured extraction (sentiment, entities, PII redaction, and custom classifiers) that complements LLM-based workflows rather than competing with them. AWS has expanded integrations between Comprehend, Bedrock, and Amazon Q so that PII can be redacted from prompts and retrieval-augmented generation (RAG) pipelines, and so that Comprehend's custom entity recognizers can be used as tools alongside LLM agents. Comprehend Medical remains a focus area for healthcare customers, with deeper integration into AWS HealthLake and FHIR-based analytics. As always, consult the official AWS What's New feed and Comprehend release notes for the most current feature list, regional availability, and language support.