Enterprise agent memory built on temporal Context Graphs (Graphiti) with millisecond retrieval, SOC 2 Type II, and HIPAA BAA.
Zep is an enterprise-focused memory layer for AI agents, structured around temporal Context Graphs rather than a flat vector store of past messages. The underlying engine, Graphiti, ingests every signal an agent touches (chat history, business data, user attributes, webhook events) and builds a graph of entities, relationships, and facts that tracks not only what was said but also when it became true, when it stopped being true, and who it applies to. At query time, Zep assembles a focused context bundle in roughly 200 milliseconds, fast enough for user-facing agent flows, where anything above 500 ms feels broken. In 2026 the product positioning has shifted from "agent memory" toward the Context Lake: millions of governed context graphs served as one system, with access control, retention, provenance, and audit built in.
Pricing changed in 2026 to a simpler credit-based model. Episodes (any chat message, JSON payload, or text block sent to Zep) cost 1 credit per 350 bytes, rounded up — so a 640-byte Episode is 2 credits and a 1,200-byte Episode is 4 credits. Storage, retrieval, memory, and users are unmetered; you only pay for ingestion. Free gives 1,000 credits/month with no rollover, two projects, and variable rate limits. Flex at $125/month includes 50,000 credits, auto top-up at 20%, 30-day rollover, 600 RPM, and five projects. Flex Plus at $375/month bumps that to 200,000 credits, 1,000 RPM, 10 projects, webhooks, analytics, custom extraction instructions, and seven-day API log retention. Enterprise unlocks SOC 2 Type II controls under contract, a HIPAA BAA, one-year audit/API log retention, guaranteed rate limits, and Cloud / Cloud + BYOK / BYOC deployment options.
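The rounding rule above is easy to get wrong when budgeting, so here is a minimal sketch of the arithmetic. The function names are hypothetical, and measuring text as UTF-8 bytes is an assumption; check how Zep actually counts Episode size before relying on the numbers.

```python
BYTES_PER_CREDIT = 350  # "1 credit per 350 bytes, rounded up" per the pricing text


def episode_credits(size_bytes: int) -> int:
    """Credits charged for one Episode of the given size in bytes."""
    if size_bytes <= 0:
        raise ValueError("Episode size must be a positive number of bytes")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (size_bytes + BYTES_PER_CREDIT - 1) // BYTES_PER_CREDIT


def text_episode_credits(text: str) -> int:
    """Credits for a text Episode, ASSUMING size means UTF-8 encoded bytes."""
    return episode_credits(len(text.encode("utf-8")))


def total_credits(episode_sizes: list[int]) -> int:
    """Total credits for a batch of Episodes, each rounded up on its own."""
    return sum(episode_credits(size) for size in episode_sizes)


# The two examples from the pricing description, plus an exact multiple.
print(episode_credits(640))             # 2
print(episode_credits(1200))            # 4
print(total_credits([640, 1200, 350]))  # 7
```

Because each Episode is rounded up separately, many small Episodes cost more than the same bytes sent as a few large ones, which is worth checking against a sample of real production payloads.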
Zep speaks REST plus official SDKs in Python, TypeScript, and Go, integrates with LangChain, LlamaIndex, Vercel AI SDK, and Mastra, and exposes its memory graph through an MCP server so MCP-aware clients (Claude Desktop, Cursor, OpenAI Agents SDK) can read and write user memory directly. The platform is SOC 2 Type II certified, signs HIPAA BAAs on Enterprise, and signs DPAs with EU customers — which makes it one of the few agent-memory tools that survives a serious procurement process at a regulated company. The S&P Global Market Intelligence reference is the public case study most often cited in that context.
The best fits are customer support copilots that need durable account history, sales agents tracking long relationships, healthcare or financial assistants where auditability is not optional, and multi-agent systems that need a shared semantic memory governed at the org level. The risks: credit-based billing is hard to predict until you measure real Episode sizes in production, the temporal graph adds modeling overhead compared to dumping conversations into a vector DB, and the most interesting governance features (audit, retention, BYOK) live behind the Enterprise plan.
Zep delivers context engineering capabilities that go well beyond simple conversation memory. Users praise the temporal knowledge graph approach for capturing entity relationships and how facts evolve over time. The <200ms retrieval latency and framework-agnostic integration make it suitable for real-time applications. Enterprise features, including SOC 2 Type II certification and HIPAA BAA support, address security requirements. Some users note that credit-based pricing can become expensive at scale and that the graph-based architecture requires more setup than simple memory stores.
Builds evolving knowledge graphs from conversations and business data, tracking how entities and relationships change over time. Automatically invalidates outdated facts while preserving provenance, ensuring agents access current, accurate information.
Use Case:
Customer support agent understands that a user's payment method was updated last week, invalidating previous 'expired card' status while maintaining history of the resolution process.
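The idea of invalidating a fact while keeping its history can be illustrated with a small sketch. This is not Graphiti's data model or API: the `Fact` and `FactLog` names, their fields, and the dates are hypothetical, and the sketch assumes facts arrive in chronological order.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_at: datetime                     # when the fact became true
    invalid_at: Optional[datetime] = None  # None means it is still true
    source: str = ""                       # provenance: where the fact came from


class FactLog:
    """Append-only log: new facts close out old ones instead of deleting them."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def _matching(self, subject: str, predicate: str) -> List[Fact]:
        return [f for f in self.facts
                if f.subject == subject and f.predicate == predicate]

    def assert_fact(self, subject: str, predicate: str, value: str,
                    at: datetime, source: str = "") -> Fact:
        # Mark any still-valid fact for this subject/predicate as ended.
        for fact in self._matching(subject, predicate):
            if fact.invalid_at is None:
                fact.invalid_at = at
        new_fact = Fact(subject, predicate, value, valid_at=at, source=source)
        self.facts.append(new_fact)
        return new_fact

    def current(self, subject: str, predicate: str) -> Optional[Fact]:
        for fact in self._matching(subject, predicate):
            if fact.invalid_at is None:
                return fact
        return None

    def as_of(self, subject: str, predicate: str, when: datetime) -> Optional[Fact]:
        # Return the fact that was true at the given moment, if any.
        for fact in self._matching(subject, predicate):
            ended = fact.invalid_at is not None and fact.invalid_at <= when
            if fact.valid_at <= when and not ended:
                return fact
        return None


# The payment-method use case, with made-up identifiers and dates.
log = FactLog()
log.assert_fact("user:42", "payment_status", "expired card",
                datetime(2026, 1, 5), source="chat")
log.assert_fact("user:42", "payment_status", "card updated",
                datetime(2026, 1, 12), source="billing webhook")
```

Here `log.current("user:42", "payment_status")` returns the "card updated" fact, while `log.as_of(...)` with a date between the two events still returns "expired card", so the resolution history stays queryable.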
Automatically ingests and correlates data from chat history, CRM systems, JSON business data, and documents into a single context graph. Retrieves and formats relevant information for LLM consumption in one API call.
Use Case:
Sales agent accessing prospect's conversation history, CRM data, and product interaction logs to provide personalized recommendations based on complete customer journey.
Delivers assembled context with <200ms P95 latency using optimized graph traversal and caching. Multiple configuration options balance accuracy, speed, and token efficiency for different use cases.
Use Case:
Voice agent providing immediate, personalized responses during live customer calls without noticeable delays, accessing complete customer context in real-time.
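A P95 figure means 95% of requests finish at or under that time, so a handful of slow outliers can still sit above it. The following is a minimal sketch for checking your own measured retrieval latencies against a 200 ms budget; the function names are hypothetical, the nearest-rank percentile method is one common choice among several, and the samples are whatever you record in your own environment.

```python
from typing import Sequence


def p95(samples_ms: Sequence[float]) -> float:
    """95th percentile using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # Nearest rank = ceil(0.95 * n), computed with integers to avoid float error.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(samples_ms: Sequence[float], budget_ms: float = 200.0) -> bool:
    """True if the measured P95 is strictly under the latency budget."""
    return p95(samples_ms) < budget_ms


# 19 fast calls and 1 slow outlier: the outlier falls in the top 5%.
print(within_budget([100.0] * 19 + [900.0]))      # True
# 2 slow calls out of 20 push the P95 over budget.
print(within_budget([100.0] * 18 + [900.0] * 2))  # False
```

For a voice agent, it is worth looking at the tail above P95 as well, since those are the calls where the delay becomes audible.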
Combines relationship-aware retrieval with traditional RAG, understanding connections between entities to surface relevant context. Supports custom entity types and relationship models for domain-specific knowledge.
Use Case:
Healthcare agent understanding patient's medication history, doctor relationships, and treatment outcomes to provide contextually appropriate health guidance.
Pre-formatted context blocks optimized for different LLM prompting strategies. Allows fine-tuned control over how entities, relationships, and facts are presented to agents.
Use Case:
E-commerce agent receiving customer context formatted with purchase history, browsing patterns, and preference summaries tailored for product recommendation workflows.
SOC 2 Type II certified with HIPAA BAA support and multiple deployment models, including BYOK, BYOM, and BYOC. Audit logs, guaranteed SLAs, and data residency controls for regulated industries.
Use Case:
Healthcare organization deploying AI patient assistants with full HIPAA compliance, encrypted data processing, and audit trails for regulatory requirements.
Free: $0
Flex: $125/mo
Flex Plus: $375/mo
Enterprise: Custom
AI agent memory
Memory infrastructure for AI agents and applications, available as an open-source framework and managed platform.
AI Memory & Search
Letta is the open-source successor to MemGPT — a stateful agent platform with persistent memory, tool use, and a visual Agent Development Environment.
AI Memory & Search
LangChain memory primitives for long-horizon agent workflows.
AI Memory & Search
Supermemory is the memory and context layer for AI agents — a graph-based memory API with extractors, connectors, and retrieval for personal apps and enterprise stacks.