Context engineering platform and memory layer for AI agents with user profiles, memory graph, retrieval capabilities, and enterprise APIs.
Supermemory is a context engineering platform and memory infrastructure layer for AI agents that provides user profiles, a memory graph, retrieval, extractors, and connectors through a single unified API, with pricing starting free and scaling to $399/month. It targets AI developers, startups, and enterprise teams building agents that require persistent, cross-session understanding of users and data.
Based on our analysis of 870+ AI tools in the Development category, Supermemory differentiates itself by offering a full five-layer context stack (connectors, extractors, retrieval, graph, and profiles) rather than the single memory layer most competitors provide. The platform processes over 100 billion tokens monthly with a sub-300ms p95 latency, and claims the #1 position on MemoryBench as well as state-of-the-art results on LongMemEval (85.2%), LoCoMo, and ConvoMem benchmarks. Its custom-built Vector Graph Engine maps real relationships between memories using ontology-aware edges rather than relying purely on similarity scores, while the User Understanding Model builds deep behavioral profiles that capture intent and preferences.
For developers, Supermemory offers TypeScript and Python SDKs plus a REST API, with a claimed five-minute setup, and integrations with Claude Code, Cursor, OpenCode, OpenClaw, Vercel AI SDK, LangChain, LangGraph, CrewAI, OpenAI SDK, Mastra, Zapier, n8n, and Pipecat. The Personal Supermemory product serves over 10,000 power users through a Chrome extension and app that let individuals capture links, chats, PDFs, images, and videos into a single memory shared across every AI tool they use. Enterprise deployments support self-hosting in customer VPCs with SOC 2, HIPAA, and GDPR compliance, plus a guarantee that customer data is never used for model training. Compared with direct competitors Mem0 and Zep, Supermemory is the only option in the comparison table to offer all six capabilities simultaneously: memory graph, user profiles, document retrieval, connectors, document extractors, and consumer plugins.
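As a sketch of what calling such a memory API could look like, the snippet below assembles hypothetical add-memory and search requests. The base URL, endpoint paths, and field names (`userId`, `q`, `limit`) are invented for illustration and are not Supermemory's actual API surface; consult the official SDK docs for real signatures.

```python
import json

# Hypothetical endpoint -- illustrative only, not the real Supermemory API.
BASE_URL = "https://api.supermemory.example/v1"

def build_add_memory_request(user_id: str, content: str) -> dict:
    """Assemble a request for storing one memory in a user's container."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/memories",
        "body": json.dumps({"userId": user_id, "content": content}),
    }

def build_search_request(user_id: str, query: str, limit: int = 5) -> dict:
    """Assemble a retrieval request scoped to one user's memories."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/search",
        "body": json.dumps({"userId": user_id, "q": query, "limit": limit}),
    }

req = build_add_memory_request("user-42", "Prefers TypeScript and dark mode")
print(req["url"])
```

In a real integration these payloads would be sent with an HTTP client and an API key; the point here is only the shape of the two core operations (write a memory, query memories per user).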
Supermemory combines connectors, extractors, retrieval, graph, and profiles into one API. Most competitors offer only one or two of these layers, forcing teams to integrate multiple services. This consolidation reduces infrastructure cost and latency while giving agents richer context than a pure vector store can provide.
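The five-layer stack described above can be pictured as a single pipeline. The toy sketch below (all class and method names invented) shows how connector output flows into extraction and then retrieval within one object, instead of across separate services:

```python
from dataclasses import dataclass, field

@dataclass
class ContextStack:
    """Toy model of a unified stack: connectors -> extractors -> retrieval."""
    documents: list = field(default_factory=list)  # raw connector output
    facts: list = field(default_factory=list)      # extractor output

    def connect(self, source_docs: list) -> None:
        # Connectors: pull raw data from external sources.
        self.documents.extend(source_docs)

    def extract(self) -> None:
        # Extractors: split documents into atomic facts (naively, on periods).
        for doc in self.documents:
            self.facts.extend(s.strip() for s in doc.split(".") if s.strip())

    def retrieve(self, query: str) -> list:
        # Retrieval: naive keyword match standing in for semantic search.
        return [f for f in self.facts if query.lower() in f.lower()]

stack = ContextStack()
stack.connect(["User works in TypeScript. User ships agents."])
stack.extract()
print(stack.retrieve("typescript"))
```

The graph and profile layers would sit on top of `facts`; the design point is that every layer shares one store and one API call path.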
Rather than relying purely on embedding similarity, the engine builds ontology-aware edges that map real relationships between memories. This lets retrieval surface connected concepts across projects, not just lexically similar chunks. Users on Twitter specifically highlight the graph visualization and cross-repo context linking as standout features.
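A minimal way to see the difference between typed edges and pure similarity is the toy graph below: two repositories connect through a shared project node, so retrieval can follow relationships even when the memories share no vocabulary. Node and relation names are invented for illustration:

```python
from collections import defaultdict

# Toy ontology-aware graph: memories linked by typed relations,
# not just embedding similarity.
edges = defaultdict(list)

def link(src: str, relation: str, dst: str) -> None:
    edges[src].append((relation, dst))

link("repo:frontend", "part_of", "project:dashboard")
link("repo:api", "part_of", "project:dashboard")
link("project:dashboard", "owned_by", "user:alice")

def related(node: str, relation: str) -> list:
    """Follow typed edges out of a node, filtered by relation."""
    return [dst for rel, dst in edges[node] if rel == relation]

# Cross-repo context: both repos resolve to the same project node.
print(related("repo:frontend", "part_of"))
print(related("repo:api", "part_of"))
```

A pure vector store would miss this link unless the two repos happened to embed near each other; an explicit `part_of` edge makes the connection deterministic.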
Supermemory builds deep behavioral profiles from user interactions, capturing intent, preferences, and context over time. This is what allows agents to move from recall ('you said X last Tuesday') to understanding ('you prefer dark mode and TypeScript, so here is a tailored answer'). It differentiates Supermemory from memory tools that only store and retrieve facts.
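One way to sketch the recall-to-understanding shift is to fold repeated interaction signals into stable preferences. The threshold, event shape, and field names below are assumptions for illustration, not Supermemory's actual model:

```python
from collections import Counter

def build_profile(events: list, min_count: int = 2) -> dict:
    """Promote a signal to a preference once it recurs across sessions."""
    counts = Counter(e["signal"] for e in events)
    return {"preferences": sorted(s for s, n in counts.items() if n >= min_count)}

events = [
    {"session": 1, "signal": "dark_mode"},
    {"session": 2, "signal": "dark_mode"},
    {"session": 2, "signal": "typescript"},
    {"session": 3, "signal": "typescript"},
    {"session": 3, "signal": "vim_keys"},  # seen once: not yet a preference
]
print(build_profile(events))
```

Plain fact storage would answer "what did the user say in session 2?"; a profile layer answers "what does this user consistently want?", which is what lets an agent tailor responses.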
The platform processes more than 100 billion tokens monthly while maintaining sub-300ms 95th-percentile retrieval latency. This makes it viable for real-time applications like voice agents, where one user reported reducing average response time from 40s to 12s by switching from traditional RAG to Supermemory. It is also one of the few memory providers to publish p95 latency numbers at this scale.
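For readers unfamiliar with the metric, p95 means 95% of requests complete at or below that latency. The snippet below computes it with the nearest-rank method; the sample latencies are made up for illustration:

```python
import math

def p95(latencies_ms: list) -> float:
    """95th-percentile latency via the nearest-rank method."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-indexed rank
    return ordered[rank - 1]

samples = [120, 95, 310, 180, 240, 150, 205, 99, 170, 260,
           140, 110, 275, 160, 130, 210, 185, 145, 225, 290]
print(p95(samples))  # a "sub-300ms p95" claim means this value stays < 300
```

Note that a p95 under 300ms still permits occasional slower outliers (the 310ms sample above), which is why tail percentiles are a more honest latency metric than averages.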
Enterprise customers can deploy Supermemory inside their own VPC and cloud environment, with SOC 2, HIPAA, and GDPR certifications in place. Supermemory commits in writing to never training models on customer data and allows full data export at any time. This combination is rare among memory-layer startups and is why regulated teams adopt it.
Pricing tiers: Free ($0) · $19/month · $399/month · Custom
As of early 2026, Supermemory publicly claims the #1 position on MemoryBench (its own open eval platform) across latency, quality, and cost. The platform now processes over 100 billion tokens monthly and reports state-of-the-art results on LongMemEval (85.2%), LoCoMo, and ConvoMem benchmarks. Recent integrations highlighted in 2026 testimonials include OpenClaw, Mastra, and Pipecat, and users have shared workflows for migrating full ChatGPT histories into Supermemory containers.
Related tools:
- AI Memory & Search — Mem0: Universal memory layer for AI agents and LLM applications. Self-improving memory system that personalizes AI interactions and reduces costs.
- AI Memory & Search: Context engineering platform that builds temporal knowledge graphs from conversations and business data, delivering personalized context to AI agents with <200ms retrieval latency.
- AI Memory & Search — Pinecone: Vector database designed for AI applications that need fast similarity search across high-dimensional embeddings. Pinecone handles the complex infrastructure of vector search operations, enabling developers to build semantic search, recommendation engines, and RAG applications with simple APIs while providing enterprise-scale performance and reliability.
- AI Memory & Search: Open-source vector database enabling hybrid search, multi-tenancy, and built-in vectorization modules for AI applications requiring semantic similarity and structured filtering combined.
- AI Agent Builders: The industry-standard framework for building production-ready LLM applications with comprehensive tool integration, agent orchestration, and enterprise observability through LangSmith.