Aegis DQ

Agentic data quality MCP server that runs validation rules against data warehouses and diagnoses failures with AI.

Starting at $0

Overview

Aegis DQ is a free, Apache 2.0 open-source agentic data quality framework that turns business documentation into executable warehouse validation rules and AI-diagnosed failure reports for data engineers, analytics engineers, compliance teams, and AI-agent builders who need auditable checks across modern data warehouses.

Aegis DQ focuses on a specific gap in traditional data quality tooling: moving from “a rule failed” to “why it failed and how to fix it.” Users point Aegis at business docs such as policies, schema definitions, SLAs, and regulatory requirements, then run `aegis generate` to produce executable checks. The website states that Aegis can generate `not_null`, `accepted_values`, and complex `custom_sql` rules, including CTEs, window functions, and multi-table JOINs. Rules then run through pipeline manifests, which capture the database, rules, docs, LLM configuration, and goal in one reusable `pipeline.yaml` file. That same manifest can be run from the CLI, Airflow, GitHub Actions, or an MCP client.

The AI layer is the tool’s main differentiator. When checks fail, Aegis classifies severity and returns a plain-English diagnosis, root cause, remediation SQL, and an audit trail of LLM decisions with cost and latency. The website includes a real-world AML/fraud demo built on 12 BSA/OFAC policies, 6 database tables, and 55 generated rules. In that demo, Aegis detected 11 violations covering CTR, OFAC, SAR, and structuring scenarios, with 5 CRITICAL violations shown in the Hermes example and a total Claude Haiku diagnosis cost of $0.01. The project also advertises 31 rule types and a 5-node LangGraph architecture covering plan, execute, reconcile, remediate, and report.
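To make the five named stages concrete, here is a minimal Python sketch of a linear plan, execute, reconcile, remediate, report flow. It does not use LangGraph and is not Aegis DQ's implementation: the `RunState` fields, the rule names, and the interpretation of "reconcile" as de-duplicating failures are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RunState:
    """Hypothetical state object passed from stage to stage."""
    goal: str
    planned_rules: List[str] = field(default_factory=list)
    failures: List[str] = field(default_factory=list)
    confirmed: List[str] = field(default_factory=list)
    fixes: Dict[str, str] = field(default_factory=dict)
    report: str = ""


def plan(state: RunState) -> RunState:
    # Assumed behaviour: pick the rules relevant to the stated goal.
    state.planned_rules = ["customer_id_not_null", "status_accepted_values"]
    return state


def execute(state: RunState) -> RunState:
    # Stand-in for running rules: pretend the first rule failed and was reported twice.
    state.failures = [state.planned_rules[0], state.planned_rules[0]]
    return state


def reconcile(state: RunState) -> RunState:
    # Assumed meaning of "reconcile": collapse duplicate failures, keeping order.
    state.confirmed = list(dict.fromkeys(state.failures))
    return state


def remediate(state: RunState) -> RunState:
    # Placeholder text where a real system would attach remediation SQL.
    state.fixes = {rule: f"-- TODO: remediation SQL for {rule}" for rule in state.confirmed}
    return state


def report(state: RunState) -> RunState:
    state.report = (
        f"{len(state.confirmed)} failing rule(s) out of {len(state.planned_rules)} planned"
    )
    return state


# The five stages in the order the website names them.
STAGES = [plan, execute, reconcile, remediate, report]


def run(goal: str) -> RunState:
    state = RunState(goal=goal)
    for stage in STAGES:
        state = stage(state)
    return state


final_state = run("validate orders")
print(final_state.report)
```

A graph framework would add branching and retries between stages; the plain list of functions here only shows the order of responsibilities.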

Aegis DQ is especially relevant for regulated environments where data checks must map back to policies, SLAs, or compliance obligations. It supports six warehouse targets listed on the website: DuckDB, Postgres, Redshift, BigQuery, Databricks, and Athena.

Key Features

Business-doc-driven rule generation

Aegis DQ can read policies, schema definitions, SLAs, and other business documents to generate executable validation rules. The website specifically mentions `not_null`, `accepted_values`, and complex `custom_sql` rules using CTEs, window functions, and multi-table JOINs.
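As a rough illustration of what these three rule types check, the sketch below expresses each one as a SQL query that counts violating rows. It uses Python's built-in `sqlite3` as a stand-in warehouse; the `orders` table, its columns, the allowed statuses, and the spending threshold are all hypothetical, and the queries are not output generated by Aegis DQ.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, customer_id INTEGER, status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 10, "shipped", 25.0),
        (2, None, "pending", 40.0),   # violates the not_null check
        (3, 11, "unknown", 15.0),     # violates the accepted_values check
        (4, 10, "shipped", 25.0),     # pushes customer 10 over the custom threshold
    ],
)

# Each check is a query returning the number of violating rows (0 means pass).
CHECKS = {
    "not_null": "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL",
    "accepted_values": (
        "SELECT COUNT(*) FROM orders "
        "WHERE status NOT IN ('pending', 'shipped', 'cancelled')"
    ),
    # A custom rule using a CTE: flag customers whose total spend exceeds 45.
    "custom_sql": """
        WITH per_customer AS (
            SELECT customer_id, SUM(amount) AS total
            FROM orders
            WHERE customer_id IS NOT NULL
            GROUP BY customer_id
        )
        SELECT COUNT(*) FROM per_customer WHERE total > 45
    """,
}


def run_checks(connection: sqlite3.Connection) -> dict:
    """Run every check and return {check_name: violation_count}."""
    return {name: connection.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}


results = run_checks(conn)
for name, violations in results.items():
    print(name, "FAIL" if violations else "PASS", violations)
```

Counting violating rows keeps all three rule types behind one interface, which is one plausible way to let simple and custom checks share a runner.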

LLM-powered failure diagnosis

When validation fails, Aegis returns root cause analysis, severity tiering, plain-English explanations, and remediation SQL. It also logs every LLM decision with cost and latency, which is important for auditability and cost control.
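One way to picture an audit trail of this kind is a log with one record per LLM decision, carrying its cost and latency. The sketch below is an assumption-laden illustration: the `LLMDecision` fields, the `fake_diagnose` stand-in, the rule names, and the per-call cost are hypothetical and do not reflect Aegis DQ's actual log format.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LLMDecision:
    """Hypothetical audit record for one diagnosis call."""
    rule_id: str
    severity: str
    diagnosis: str
    cost_usd: float
    latency_s: float


def record_decision(
    rule_id: str,
    diagnose: Callable[[str], Tuple[str, str, float]],
    audit_log: List[LLMDecision],
) -> LLMDecision:
    """Time one diagnosis call and append the outcome to the audit log."""
    start = time.perf_counter()
    severity, diagnosis, cost_usd = diagnose(rule_id)
    latency_s = time.perf_counter() - start
    entry = LLMDecision(rule_id, severity, diagnosis, cost_usd, latency_s)
    audit_log.append(entry)
    return entry


def fake_diagnose(rule_id: str) -> Tuple[str, str, float]:
    # Stand-in for a real model call; returns (severity, diagnosis, cost in USD).
    return "CRITICAL", f"{rule_id}: placeholder diagnosis", 0.001


audit_log: List[LLMDecision] = []
record_decision("ctr_threshold", fake_diagnose, audit_log)
record_decision("ofac_screening", fake_diagnose, audit_log)

total_cost = sum(entry.cost_usd for entry in audit_log)
print(f"{len(audit_log)} decisions logged, total cost ${total_cost:.3f}")
```

Keeping cost and latency on every record is what makes per-run totals and per-rule breakdowns cheap to compute afterwards.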

Pipeline manifests

A `pipeline.yaml` file can capture the database, rules, docs, LLM configuration, and goal in one reusable manifest. The same configuration can run from the CLI, Airflow, GitHub Actions, or any MCP client.
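The five things a manifest is said to capture can be pictured as five top-level sections. The sketch below models them as a plain Python dictionary with a small completeness check. The key names, nested fields, and file paths are hypothetical; the real `pipeline.yaml` schema may differ, so treat this only as a picture of the idea.

```python
from typing import Any, Dict, List

# Assumed section names, taken from the five items the website says a manifest captures.
REQUIRED_SECTIONS = ("database", "rules", "docs", "llm", "goal")


def missing_sections(manifest: Dict[str, Any]) -> List[str]:
    """Return the required sections that are absent from the manifest."""
    return [key for key in REQUIRED_SECTIONS if key not in manifest]


# Hypothetical manifest content; every value here is illustrative.
manifest = {
    "database": {"type": "duckdb", "path": "warehouse.duckdb"},
    "rules": ["rules/orders.yaml"],
    "docs": ["docs/order_policy.md"],
    "llm": {"provider": "ollama"},
    "goal": "Validate the orders tables against the order policy",
}

problems = missing_sections(manifest)
print("manifest complete" if not problems else f"missing: {problems}")
```

Because the manifest is a single self-describing document, any runner (a CLI, a scheduler, a CI job, or an agent) can load the same file and perform the same check before executing it.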

Hermes MCP integration

Hermes connects to Aegis through MCP so users can load a pipeline and run validation in plain English. The website says Hermes can remember past runs, schedule recurring validations, diagnose failures, and deliver results where users work.

Multi-provider and offline LLM options

The website lists Anthropic Claude, OpenAI, AWS Bedrock, and Ollama as LLM options. Teams can run no-LLM mode for rules only, use Ollama locally at $0 model cost, or use hosted LLMs when they want richer diagnosis and remediation output.

Pricing Plans

Open Source

Best Use Cases

🎯 Regulated AML or fraud monitoring workflows where data checks must map back to BSA, OFAC, SAR, CTR, structuring, or PEP oversight requirements and produce auditable explanations

⚡ Analytics engineering teams that already maintain business policies, schemas, and SLAs and want to generate `not_null`, `accepted_values`, and complex SQL validation rules from that documentation

🔧 Data platform teams that need the same validation pipeline to run from CLI during development, Airflow in production, GitHub Actions in CI, and MCP clients for agent workflows

🚀 Teams piloting conversational data quality operations through Hermes, where a user can ask an agent to load a pipeline, run checks, diagnose failures, and schedule recurring validations

💡 Organizations that need offline or low-cost validation options, using no-LLM mode for rules-only checks or Ollama for local model execution at $0 model cost

🔄 Data engineers investigating failed warehouse checks who need severity tiers, root cause summaries, and remediation SQL instead of manually tracing each failing rule

Limitations & What It Can't Do

We believe in transparent reviews. Here's what Aegis DQ doesn't handle well:

⚠ The website does not list a hosted managed version, enterprise SLA, or vendor support package
⚠ Aegis DQ is shown as v0.7.0, so production adopters should evaluate stability, migration practices, and community maturity carefully
⚠ Only six warehouse targets are explicitly listed: DuckDB, Postgres, Redshift, BigQuery, Databricks, and Athena
⚠ Root cause analysis and remediation quality depend on the selected LLM provider, prompt context, source documentation quality, and database metadata available to the tool
⚠ The website provides a strong AML/fraud demo but does not publish broad benchmark results, user counts, or large-scale production adoption statistics

Pros & Cons

✓ Pros

✓ Generates rules directly from business docs, policies, schema definitions, and SLAs, reducing the need to hand-author every validation rule
✓ Provides plain-English root cause analysis, severity tiers, and remediation SQL for each failing rule instead of only reporting pass/fail status
✓ Supports six warehouse targets listed on the website: DuckDB, Postgres, Redshift, BigQuery, Databricks, and Athena
✓ Hermes MCP integration lets users load a pipeline, run validation, diagnose failures, remember past runs, and schedule recurring checks through plain-English prompts
✓ Pipeline manifests package database settings, rules, docs, LLM config, and goals into one reusable `pipeline.yaml` that can run through CLI, Airflow, GitHub Actions, or MCP clients
✓ Apache 2.0 open source project with $0 no-LLM mode and $0 local Ollama option, plus transparent LLM cost tracking such as the $0.01 Claude Haiku AML demo

✗ Cons

✗ No managed cloud offering or subscription pricing is shown on the website, so teams must self-host and operate it themselves
✗ Advanced diagnosis depends on LLM configuration and model quality; without an LLM, Aegis can still run rules but loses the root cause and remediation layer
✗ Rule generation quality depends on the completeness and accuracy of the business docs, policies, schemas, or SLAs provided
✗ The project is shown as v0.7.0, which suggests a relatively early-stage release compared with mature enterprise data quality platforms
✗ The website lists six warehouse integrations, which is useful but narrower than larger observability suites that cover many more data platforms and SaaS connectors

Frequently Asked Questions

What does Aegis DQ do?

Aegis DQ turns business documentation into executable data quality rules, runs those rules against a warehouse, and diagnoses failures with an LLM. The website describes using policies, schema definitions, SLAs, and similar documents as input. Its output includes severity classification, root cause analysis, and remediation SQL, plus an audit trail of LLM cost and latency.

Which data warehouses does Aegis DQ support?

The website lists DuckDB, Postgres, Redshift, BigQuery, Databricks, and Athena as supported warehouse targets. This covers common local, open source, cloud warehouse, and lakehouse environments. Teams using a warehouse outside those six should verify adapter support before adopting it for production workflows.

Does Aegis DQ require an LLM?

Aegis DQ can run in `--no-llm` mode for rules-only validation at $0 model cost. However, the website’s main value proposition depends on LLM-powered diagnosis, including explanations, root causes, severity tiers, and remediation SQL. Supported LLM provider options listed on the site include Anthropic Claude, OpenAI, AWS Bedrock, and Ollama for local/offline usage.

How much does Aegis DQ cost?

The website presents Aegis DQ as an Apache 2.0 open source project and does not list paid SaaS tiers. It highlights $0 no-LLM mode, $0 local Ollama usage, and a documented AML/fraud example where 55 rules and 11 diagnosed violations cost $0.01 using Claude Haiku. Real costs will depend on warehouse compute, hosting, and whichever LLM provider or local model configuration a team chooses.
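For a back-of-the-envelope feel for the LLM portion of that cost, the sketch below scales the demo figure linearly. This is an assumption, not a published pricing model: the $0.01 total is presumably rounded, cost per diagnosis will vary with prompt size and provider, and warehouse compute and hosting are excluded. The function name `estimated_llm_cost` is hypothetical.

```python
from decimal import Decimal

# Figures reported for the AML/fraud demo on the website.
DEMO_COST_USD = Decimal("0.01")
DEMO_VIOLATIONS = 11


def estimated_llm_cost(violations: int) -> Decimal:
    """Scale the demo cost linearly with the number of diagnosed violations.

    Linear scaling is an assumption made only for rough budgeting.
    """
    # Multiply before dividing so round multiples of 11 stay exact.
    return DEMO_COST_USD * violations / DEMO_VIOLATIONS


per_violation = DEMO_COST_USD / DEMO_VIOLATIONS
print(f"approx. ${per_violation:.4f} per diagnosed violation in the demo")
print(f"approx. ${estimated_llm_cost(1100):.2f} for 1,100 diagnosed violations")
```

Under that assumption the demo works out to roughly a tenth of a cent per diagnosed violation, so LLM spend is likely to be small next to warehouse compute unless failure volumes are very high.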

How is Aegis DQ different from Great Expectations, Soda, or Monte Carlo?

The Aegis DQ website positions it as a tool that explains why a data quality failure happened and how to fix it, not just that a rule failed. Great Expectations and Soda are often used for defining and running validation checks, while Monte Carlo is more focused on managed data observability. Aegis DQ is most distinctive when business docs should drive rule generation and when remediation SQL and audit trails matter.

What's New in 2026

The website labels v0.7.0 as current and highlights new Pipeline Manifests plus Hermes MCP. It also lists a Hermes Integration marked NEW, including pipeline manifests, MCP tools, and conversational data quality through one prompt.
