Agentic data quality MCP server that runs validation rules against data warehouses and diagnoses failures with AI.
Aegis DQ is a free, open-source (Apache 2.0) agentic data quality framework that turns business documentation into executable warehouse validation rules and AI-diagnosed failure reports. It is aimed at data engineers, analytics engineers, compliance teams, and AI-agent builders who need auditable checks across modern data warehouses.
Aegis DQ focuses on a specific gap in traditional data quality tooling: moving from “a rule failed” to “why it failed and how to fix it.” Users point Aegis at business docs such as policies, schema definitions, SLAs, and regulatory requirements, then run `aegis generate` to produce executable checks. The website states that Aegis can generate `not_null`, `accepted_values`, and complex `custom_sql` rules, including CTEs, window functions, and multi-table JOINs. Rules can then run through pipeline manifests that capture the database, rules, docs, LLM configuration, and goal in one reusable `pipeline.yaml` file. That same manifest can be run from the CLI, Airflow, GitHub Actions, or an MCP client.
The tool’s AI layer is its main differentiator. When checks fail, Aegis classifies severity and returns a plain-English diagnosis, root cause, remediation SQL, and an audit trail of LLM decisions with cost and latency. The website includes a real-world AML/fraud demo built from 12 BSA/OFAC policies, 6 database tables, and 55 generated rules. In that demo, Aegis detected 11 violations covering CTR, OFAC, SAR, and structuring scenarios, with 5 CRITICAL violations shown in the Hermes example and a total Claude Haiku diagnosis cost of $0.01. The project also advertises 31 rule types and a 5-node LangGraph architecture covering plan, execute, reconcile, remediate, and report.
Aegis DQ is especially relevant for regulated environments where data checks must map back to policies, SLAs, or compliance obligations. It supports six warehouse targets listed on the website: DuckDB, Postgres, Redshift, BigQuery, Databricks, and Athena.
Aegis DQ can read policies, schema definitions, SLAs, and other business documents to generate executable validation rules. The website specifically mentions `not_null`, `accepted_values`, and complex `custom_sql` rules using CTEs, window functions, and multi-table JOINs.
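To make the two simpler rule types concrete, here is a minimal sketch of what a `not_null` and an `accepted_values` check amount to when expressed as SQL. It is not Aegis's implementation or API: the helper functions, the `transactions` table, and its columns are hypothetical, and Python's built-in `sqlite3` stands in for a real warehouse.

```python
import sqlite3

# In-memory SQLite stands in for a warehouse; table and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, customer_id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, "c1", "posted"),
        (2, None, "posted"),    # violates the not_null check on customer_id
        (3, "c3", "unknown"),   # violates the accepted_values check on status
        (4, "c4", "pending"),
    ],
)


def not_null_violations(conn, table, column):
    """Count rows where the column is NULL."""
    sql = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(sql).fetchone()[0]


def accepted_values_violations(conn, table, column, allowed):
    """Count non-NULL rows whose value falls outside the allowed set."""
    placeholders = ", ".join("?" for _ in allowed)
    sql = (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE {column} IS NOT NULL AND {column} NOT IN ({placeholders})"
    )
    return conn.execute(sql, list(allowed)).fetchone()[0]


null_count = not_null_violations(conn, "transactions", "customer_id")
bad_status_count = accepted_values_violations(
    conn, "transactions", "status", ["posted", "pending"]
)
print(null_count, bad_status_count)
```

A `custom_sql` rule follows the same pattern: any query that returns the violating rows (or their count) can serve as the check, which is why CTEs, window functions, and JOINs fit naturally.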
When validation fails, Aegis returns root cause analysis, severity tiering, plain-English explanations, and remediation SQL. It also logs every LLM decision with cost and latency, which is important for auditability and cost control.
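As a rough illustration of why per-decision cost and latency logging helps, the sketch below models an audit record and rolls a run up into a summary. The `LlmDecision` fields, the `summarize` helper, and all sample values are assumptions made for this example; they are not Aegis's actual log format or real measurements.

```python
from dataclasses import dataclass


@dataclass
class LlmDecision:
    """One hypothetical audit-trail entry for an LLM diagnosis."""
    rule_id: str
    severity: str
    model: str
    cost_usd: float
    latency_ms: int


def summarize(decisions):
    """Aggregate total cost, counts per severity, and the slowest rule."""
    by_severity = {}
    for d in decisions:
        by_severity[d.severity] = by_severity.get(d.severity, 0) + 1
    slowest = max(decisions, key=lambda d: d.latency_ms) if decisions else None
    return {
        "total_cost_usd": round(sum(d.cost_usd for d in decisions), 6),
        "by_severity": by_severity,
        "slowest_rule": slowest.rule_id if slowest else None,
    }


# Illustrative values only; not taken from any real run.
decisions = [
    LlmDecision("ctr_threshold", "CRITICAL", "example-model", 0.0012, 840),
    LlmDecision("status_accepted_values", "LOW", "example-model", 0.0008, 610),
]
summary = summarize(decisions)
print(summary)
```

With records like these, a team could set a budget alert on total cost or flag rules whose diagnoses are consistently slow.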
A `pipeline.yaml` file can capture the database, rules, docs, LLM configuration, and goal in one reusable manifest. The same configuration can run from the CLI, Airflow, GitHub Actions, or any MCP client.
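The five things the manifest is said to capture can be pictured as top-level sections of one document. The sketch below represents such a manifest as a Python dictionary and checks that every section is present. The key names, nested fields, and file paths are hypothetical stand-ins chosen for illustration; the real `pipeline.yaml` schema is defined by the project's own documentation.

```python
# Hypothetical section names mirroring the five items the website lists.
REQUIRED_SECTIONS = ("database", "rules", "docs", "llm", "goal")


def missing_sections(manifest):
    """Return the required sections that are absent or empty."""
    empty_values = (None, "", [], {})
    return [
        key
        for key in REQUIRED_SECTIONS
        if key not in manifest or manifest[key] in empty_values
    ]


# Illustrative manifest; every value here is an assumption, not real config.
manifest = {
    "database": {"type": "duckdb", "path": "warehouse.duckdb"},
    "rules": ["rules/transactions.yaml"],
    "docs": ["docs/aml_policy.md"],
    "llm": {"provider": "ollama"},
    "goal": "Validate transaction data against AML policy",
}

print(missing_sections(manifest))
print(missing_sections({"database": {"type": "duckdb"}}))
```

Keeping everything in one file is what lets the same run be triggered unchanged from a scheduler, a CI job, or an MCP client.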
Hermes connects to Aegis through MCP so users can load a pipeline and run validation in plain English. The website says Hermes can remember past runs, schedule recurring validations, diagnose failures, and deliver results where users work.
The website lists Anthropic Claude, OpenAI, AWS Bedrock, and Ollama as LLM options. Teams can run no-LLM mode for rules only, use Ollama locally at $0 model cost, or use hosted LLMs when they want richer diagnosis and remediation output.
The website labels v0.7.0 as current and highlights new Pipeline Manifests plus Hermes MCP. It also lists a Hermes Integration marked NEW, including pipeline manifests, MCP tools, and conversational data quality through one prompt.