Zep vs. MemGPT: Which Is the Better Memory Layer Service for Enterprise AI Agents? (With a 2026 Benchmark Comparison)

2026/4/10

AI Summary (BLUF)

Zep introduces a novel memory layer service for AI agents that outperforms MemGPT in the Deep Memory Retrieval benchmark and excels in enterprise temporal reasoning tasks through its Graphiti knowledge graph engine.

Abstract

We introduce Zep, a novel memory layer service for AI agents that outperforms the current state-of-the-art system, MemGPT, in the Deep Memory Retrieval (DMR) benchmark. Additionally, Zep excels in more comprehensive and challenging evaluations than DMR that better reflect real-world enterprise use cases. While existing retrieval-augmented generation (RAG) frameworks for large language model (LLM)-based agents are limited to static document retrieval, enterprise applications demand dynamic knowledge integration from diverse sources including ongoing conversations and business data. Zep addresses this fundamental limitation through its core component Graphiti -- a temporally-aware knowledge graph engine that dynamically synthesizes both unstructured conversational data and structured business data while maintaining historical relationships. In the DMR benchmark, which the MemGPT team established as their primary evaluation metric, Zep demonstrates superior performance. Beyond DMR, Zep's capabilities are further validated through the more challenging LongMemEval benchmark, which better reflects enterprise use cases through complex temporal reasoning tasks. In this evaluation, Zep achieves substantial results with accuracy improvements of up to 18.5% while simultaneously reducing response latency by 90% compared to baseline implementations. These results are particularly pronounced in enterprise-critical tasks such as cross-session information synthesis and long-term context maintenance, demonstrating Zep's effectiveness for deployment in real-world applications.

Core Concept: From Static Retrieval to Dynamic Memory

Currently, most LLM-based agents rely on Retrieval-Augmented Generation frameworks to access external knowledge. However, traditional RAG methods primarily retrieve from static document repositories, where knowledge is fixed and discrete. This model exhibits significant shortcomings when confronted with dynamic and continuous enterprise environments.

Knowledge in enterprise application scenarios is fluid and interconnected, emerging from:

  • Continuous Conversation Streams: Context generated from multi-turn, cross-session interactions between users and agents.
  • Real-time Business Data: Structured data from systems like CRM and ERP, whose states change over time.
  • Historical Relationships and Event Sequences: The causality and timeline behind decisions and state changes.

Zep's design goal is precisely to address the challenge of dynamic knowledge integration. Its core innovation lies in elevating an agent's "memory" from a passive retrieval repository to an active Memory Layer Service capable of understanding temporal context and entity relationships.

Technical Architecture and Core Component: Graphiti

Zep's exceptional performance stems from its core engine, Graphiti. This is a dynamic knowledge graph engine specifically designed for temporal awareness.

How Graphiti Works

Graphiti does not simply store text fragments. Instead, it extracts entities, events, and their relationships from multi-source data streams in real-time, tags them with precise timestamps, and constructs a continuously evolving knowledge graph.

  1. Multi-source Data Ingestion and Fusion:
    • Unstructured Conversations: Parses dialogue records to identify mentioned entities (e.g., people, projects, products) as well as user intents and actions.
    • Structured Business Data: Connects to databases, incorporating entities and attributes such as order status, user profiles, and transaction records into the graph.
    • Temporal Anchoring: Associates every extracted piece of information with the time it occurred or was last updated.
  2. Dynamic Graph Construction and Reasoning:
    • Entities and relationships are created, updated, or enriched as new data arrives.
    • The system can infer implicit relationships, for example, linking "Customer A's requirement" mentioned across different sessions to the "priority change of Project X" recorded in the business system.
  3. Context-Aware Memory Retrieval:
    • When an agent needs to recall something, Graphiti receives the current query and context.
    • Beyond keyword matching, it runs graph traversal queries constrained by time windows, entity relationship chains, and event causality, returning the most relevant and contextually coherent memory fragments.
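
The three steps above can be illustrated with a small toy model. The sketch below is not Graphiti's API or implementation; the names `Fact`, `TemporalGraph`, `ingest`, `facts_at`, and `neighbors` are hypothetical, and it assumes that each (subject, relation) pair holds one value at a time and that updates for a pair arrive in chronological order. It shows only the general idea: facts carry validity timestamps, a newer fact closes the older one instead of deleting it, and retrieval walks the edges that were valid at a chosen moment.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    """A timestamped edge: (subject) -[relation]-> (obj)."""
    subject: str
    relation: str
    obj: str
    valid_at: datetime                     # when the fact became true
    invalid_at: Optional[datetime] = None  # set when a newer fact supersedes it
    source: str = "conversation"           # e.g. "conversation" or "crm"


class TemporalGraph:
    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def ingest(self, fact: Fact) -> None:
        """Add a fact, closing any open fact it contradicts (history is kept)."""
        for old in self.facts:
            same_slot = (old.subject, old.relation) == (fact.subject, fact.relation)
            if not same_slot or old.invalid_at is not None:
                continue
            if old.obj == fact.obj:
                return  # already recorded and still valid
            old.invalid_at = fact.valid_at
        self.facts.append(fact)

    def facts_at(self, entity: str, when: datetime) -> list[Fact]:
        """Facts touching `entity` that were valid at time `when`."""
        return [
            f for f in self.facts
            if entity in (f.subject, f.obj)
            and f.valid_at <= when
            and (f.invalid_at is None or when < f.invalid_at)
        ]

    def neighbors(self, entity: str, when: datetime, hops: int = 2) -> set[str]:
        """Entities reachable from `entity` within `hops` edges valid at `when`."""
        seen, frontier = {entity}, {entity}
        for _ in range(hops):
            reached: set[str] = set()
            for e in frontier:
                for f in self.facts_at(e, when):
                    reached.update({f.subject, f.obj})
            frontier = reached - seen
            seen |= frontier
        return seen - {entity}


# Illustrative data only: one conversational fact and two CRM updates.
g = TemporalGraph()
g.ingest(Fact("Customer A", "requested", "Project X", datetime(2025, 1, 5)))
g.ingest(Fact("Project X", "priority", "low", datetime(2025, 1, 1), source="crm"))
g.ingest(Fact("Project X", "priority", "high", datetime(2025, 2, 1), source="crm"))

january = g.facts_at("Project X", datetime(2025, 1, 15))
linked = g.neighbors("Customer A", datetime(2025, 2, 10))
```

In this toy data, a query as of mid-January still sees the old "low" priority, while a two-hop traversal from "Customer A" in February reaches "Project X" and its current "high" priority. A production engine would also need entity resolution, multi-valued relations, and indexing, none of which are modeled here.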

Performance Evaluation: Benchmark Results

Zep underwent rigorous evaluation across multiple benchmarks. Its performance not only surpassed existing solutions but also demonstrated significant advantages in complex tasks that closely mirror real enterprise needs.

Deep Memory Retrieval (DMR) Benchmark

The DMR benchmark, proposed by the MemGPT team, evaluates an agent's ability to recall specific facts and details from earlier sessions of a multi-session conversation. On this benchmark, Zep achieved the leading result: 94.8% accuracy versus 93.4% for MemGPT.

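| Metric | MemGPT (baseline) | Zep | Key takeaway |
| --- | --- | --- | --- |
| DMR accuracy | 93.4% | 94.8% | Zep surpasses MemGPT on the core metric set by the MemGPT team, demonstrating stronger basic retrieval capability. |
| Retrieval relevance | | Extremely high | Knowledge-graph-based retrieval better understands query intent and returns more precise context. |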

LongMemEval Comprehensive Benchmark

To more comprehensively evaluate enterprise-grade capabilities, the research introduced the more complex LongMemEval benchmark. This test includes tasks such as cross-session reasoning, temporal Q&A, and long-term dependency understanding, placing extremely high demands on a system's dynamic memory and synthesis capabilities.

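| Task type | Baseline average accuracy | Zep accuracy | Improvement | Response latency reduction |
| --- | --- | --- | --- | --- |
| Cross-session information synthesis | 71.2% | 84.5% | +13.3% | ~90% |
| Complex temporal reasoning | 65.8% | 84.3% | +18.5% | |
| Long-term context maintenance | 78.9% | 92.1% | +13.2% | |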

Analysis of Results

  • Significant Accuracy Improvement: In the most challenging task, "Complex Temporal Reasoning", Zep's accuracy rose by 18.5 percentage points (from 65.8% to 84.3%). This supports the claim that the Graphiti engine handles temporal logic and event sequences well.
  • Notable Latency Reduction: Despite performing more complex graph computations, Zep reduced response latency by about 90% compared with baseline methods through efficient indexing and query optimization. This matters for enterprise applications that require real-time interaction.
  • Enterprise Scenario Advantages: Strong results on the "Cross-session Information Synthesis" and "Long-term Context Maintenance" tasks indicate that Zep can support real business scenarios requiring coherent, personalized service, such as customer support, project management, and decision assistance.
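
The arithmetic behind these figures can be checked with a short sketch. The accuracy values are copied from the LongMemEval table above; the function names `gains` and `speedup_from_latency_reduction` are hypothetical helpers introduced here for illustration. The sketch separates the absolute gain (percentage points) from the relative gain, and shows what a 90% latency reduction means as a speed-up factor.

```python
# Accuracy figures copied from the LongMemEval table: (baseline %, Zep %).
results = {
    "Cross-session information synthesis": (71.2, 84.5),
    "Complex temporal reasoning": (65.8, 84.3),
    "Long-term context maintenance": (78.9, 92.1),
}


def gains(baseline: float, zep: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = round(zep - baseline, 1)
    relative = round((zep - baseline) / baseline * 100, 1)
    return absolute, relative


def speedup_from_latency_reduction(reduction: float) -> float:
    """A latency reduction of `reduction` (0..1) leaves (1 - reduction) of the
    original latency, so throughput per request improves by 1 / (1 - reduction)."""
    return 1.0 / (1.0 - reduction)


for task, (baseline, zep) in results.items():
    absolute, relative = gains(baseline, zep)
    print(f"{task}: +{absolute} points absolute, +{relative}% relative")

print(f"90% lower latency = {speedup_from_latency_reduction(0.9):.0f}x faster")
```

The "Improvement" column in the table matches the absolute difference in percentage points; the relative gain over the baseline is larger in every row. Likewise, a 90% reduction in latency corresponds to responses arriving in one tenth of the baseline time.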

Summary and Outlook

Zep represents a paradigm shift in building memory systems for AI agents: a move from static document retrieval to dynamic, temporally-aware knowledge graph memory. Through its core component Graphiti, Zep integrates ongoing conversations and changing business data into long-term memory that an agent can reason over, maintaining high accuracy while reducing response latency by about 90% relative to baseline implementations.

This research lays the groundwork for next-generation enterprise AI applications. In the future, memory layer services may further integrate deeply with workflow engines and decision systems, becoming the core infrastructure for enterprise intelligent digital assistants, ultimately realizing AI partners capable of continuous learning, situational understanding, and historical traceability.


Paper Information:

  • Title: Zep: A Memory Layer for AI Agents
  • Authors: Preston Rasmussen et al.
  • Link: arXiv:2501.13956
  • Categories: Computation and Language, Artificial Intelligence, Information Retrieval

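Frequently Asked Questions (FAQ)

What is the core advantage of Zep's memory layer service over traditional RAG frameworks?

Through the Graphiti knowledge graph engine, Zep integrates memory dynamically and can handle ongoing conversations and real-time business data. Traditional RAG is limited to static document retrieval and cannot adapt to a changing enterprise environment.

How does the Graphiti knowledge graph engine address temporal reasoning?

Graphiti extracts entities, events, and their relationships from multiple data sources and timestamps them, building an evolving knowledge graph that preserves historical relationships. It is designed specifically for temporal awareness and supports complex temporal reasoning tasks.

On which benchmarks has Zep demonstrated its performance advantage?

Zep outperforms MemGPT on the Deep Memory Retrieval benchmark. On the more comprehensive LongMemEval benchmark, it improves accuracy by up to 18.5% and reduces response latency by 90%, with particularly strong results on cross-session information synthesis.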
