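How does the VAC Memory System reach 80.1% accuracy on the LoCoMo 2025 benchmark?

Q: What accuracy does the VAC Memory System achieve on the LoCoMo 2025 benchmark?

The VAC Memory System achieves 80.1% accuracy on the LoCoMo 2025 benchmark, placing second on the leaderboard behind MemMachine at 84.87%.

Q: Which techniques make up the VAC Memory System's hybrid retrieval architecture?

The system uses a hybrid retrieval architecture that combines MCA gating, FAISS semantic search, BM25 lexical search, and cross-encoder reranking for high-precision memory retrieval.

Q: What are the two versions of the VAC Memory System, and how do they differ?

Two versions are provided: LITE (an open-source implementation for learning the architecture) and FULL (the compiled production build). The FULL version is the production code that was run on the LoCoMo benchmark.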

From Cell Tower Climber to SOTA AI Memory in 4.5 Months

The world's most accurate open-source conversational memory for LLM agents

📖 The Impossible Story

No CS degree. No programming background. Just a handyman with a dream and Claude in the terminal.

Started: Zero coding knowledge, installing closets on TaskRabbit
Weapon: RTX 4090 bought on installments + pure obsession
Result: SOTA-level 80% on LoCoMo
Time: 4.5 months of 18-hour days

This repository isn't just code. It's proof that "impossible" is just a starting point.

🏆 The Numbers Don't Lie

Official LoCoMo 2025 Benchmark Results

100 test runs with the GPT-4o-mini generous judge

LoCoMo Benchmark Leaderboard - GPT-4o-mini (2025)

Rank	System	Accuracy	Notes
🥇	MemMachine	84.87%	Single-hop: 93.3%, Multi-hop: 80.5%, Temporal: 72.6%
🥈	VAC Memory System	80.1%	100 validated runs, MCA + FAISS + BM25 + Cross-encoder
🥉	Memobase	75.78%	-
4️⃣	Zep	75.14%	-
5️⃣	Letta (MemGPT)	74.0%	File-based with semantic search
6️⃣	Mem0 (Graph variant)	68.5%	+26% vs OpenAI baseline
7️⃣	Mem0 (default)	66.88%	Standard variant

Breakdown by Conversation (10 conversations × 10 seeds)

Conv	Questions	Mean Accuracy	Peak	Insights
0	152	87.5%	87.5%	🔥 Best performer
7	191	86.4%	87.2%	🔥 Consistent excellence
2	152	85.5%	86.2%	🔥 Rock solid
1	81	80.2%	81.5%	✅ Above baseline
9	158	77.8%	79.1%	✅ Strong recall
3-8 (excl. 7)	806	76.7%	78.4%	✅ Reliable range

Total: 1,540 questions evaluated → 80.1% mean accuracy
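
As a sanity check on the reported mean, the per-conversation figures can be combined into a question-weighted average. The sketch below is illustrative only. It assumes the overall figure is weighted by question count, and it takes the question count for the grouped conversations (3-8, excluding 7) as the stated total of 1,540 minus the five individually listed conversations. All variable names are hypothetical.

```python
# Per-conversation results from the breakdown: (questions, mean accuracy in %).
listed = {
    "0": (152, 87.5),
    "7": (191, 86.4),
    "2": (152, 85.5),
    "1": (81, 80.2),
    "9": (158, 77.8),
}
TOTAL_QUESTIONS = 1540   # stated total
GROUPED_MEAN = 76.7      # conversations 3-8, excluding 7

# Assumption: the grouped row covers every question not listed individually.
grouped_questions = TOTAL_QUESTIONS - sum(n for n, _ in listed.values())

# Question-weighted mean over all rows.
weighted_sum = sum(n * acc for n, acc in listed.values())
weighted_sum += grouped_questions * GROUPED_MEAN
overall = weighted_sum / TOTAL_QUESTIONS

print(f"grouped questions: {grouped_questions}")
print(f"weighted mean accuracy: {overall:.1f}%")
```

Under these assumptions the weighted mean comes out at roughly 80.1%, in line with the headline figure.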

⚙️ How It Works
flowchart LR
    A[🗣 Query] --> B[🧠 Preprocess]
    B --> C{🎯 MCA Gate}
    B --> D[🔍 FAISS]
    B --> E[📚 BM25]

    C --> F[🔀 Union]
    D --> F
    E --> F

    F --> G[⚖️ Rerank]
    G --> H[💬 GPT-4o-mini]
    H --> I[✅ Answer]

    style A fill:#e1f5fe
    style C fill:#fff3e0
    style G fill:#f3e5f5
    style I fill:#e8f5e9
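
The union and rerank stages of the flowchart can be sketched in a few lines of Python. This is a toy illustration, not the VAC implementation: the three retrievers are hard-coded stubs standing in for the MCA gate, FAISS, and BM25, and `overlap_score` is a hypothetical stand-in for a cross-encoder. All names here are assumptions made for the example.

```python
import re
from typing import Callable, Dict, List

Retriever = Callable[[str], List[str]]  # query -> candidate memory IDs


def tokens(text: str) -> set:
    """Lowercased word tokens of a string."""
    return set(re.findall(r"\w+", text.lower()))


def overlap_score(query: str, passage: str) -> int:
    """Stand-in for a cross-encoder: counts tokens shared by query and passage."""
    return len(tokens(query) & tokens(passage))


def hybrid_retrieve(query: str, retrievers: List[Retriever],
                    memories: Dict[str, str], top_k: int = 2) -> List[str]:
    # Union step: merge all candidates, dropping duplicates in first-seen order.
    candidates = list(dict.fromkeys(
        mem_id for retrieve in retrievers for mem_id in retrieve(query)
    ))
    # Rerank step: score every (query, passage) pair and keep the best top_k.
    ranked = sorted(candidates,
                    key=lambda mem_id: overlap_score(query, memories[mem_id]),
                    reverse=True)
    return ranked[:top_k]


memories = {
    "m1": "Alice adopted a dog named Rex in May 2023",
    "m2": "Bob likes hiking in the Alps",
    "m3": "Alice moved to Berlin last year",
}

# Hard-coded stubs; real retrievers would compute these lists from the query.
def gate_stub(query: str) -> List[str]:
    return ["m1"]

def dense_stub(query: str) -> List[str]:
    return ["m3", "m1"]

def lexical_stub(query: str) -> List[str]:
    return ["m1", "m2"]

top = hybrid_retrieve("When did Alice adopt a dog?",
                      [gate_stub, dense_stub, lexical_stub], memories)
print(top)
```

The point of the union is that a memory found by any one retriever survives to the reranker, which then decides the final order. The top-ranked passages would then be passed to the answering model.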
🎓 Two Versions: LITE (Open Source) vs FULL (Compiled)

LITE Version - Learn the Architecture

# Open source Python implementation - understand how VAC works
python mca_lite.py          # ~40 lines: keyword matching
python pipeline_lite.py     # ~250 lines: 4-step pipeline

LITE shows the core concepts.
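
To give a feel for what a keyword-matching gate might look like, here is a minimal sketch. It is not the contents of `mca_lite.py`. The stopword list, the `min_hits` threshold, and the function names are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Set

# Tiny illustrative stopword list; a real one would be much larger.
STOPWORDS = {"a", "an", "the", "is", "was", "did", "do", "of", "in",
             "to", "and", "what", "when", "where", "who"}


def keywords(text: str) -> Set[str]:
    """Lowercased word tokens with stopwords removed."""
    return {t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS}


def keyword_gate(query: str, memories: Dict[str, str],
                 min_hits: int = 1) -> List[str]:
    """Return IDs of memories sharing at least min_hits keywords with the query."""
    query_terms = keywords(query)
    return [mem_id for mem_id, text in memories.items()
            if len(query_terms & keywords(text)) >= min_hits]


memories = {
    "m1": "Alice adopted a dog named Rex in May 2023",
    "m2": "Bob likes hiking in the Alps",
    "m3": "Alice moved to Berlin last year",
}

loose = keyword_gate("What is the name of Alice's dog?", memories)
strict = keyword_gate("What is the name of Alice's dog?", memories, min_hits=2)
print(loose, strict)
```

Raising `min_hits` trades recall for precision: the looser setting keeps every memory that mentions Alice, while the stricter one keeps only the memory that mentions both Alice and the dog.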

FULL Version - Production Code Used on the LoCoMo Benchmark

# Pre-compiled optimized binaries (Core/*.so)
./run_test.sh               # Linux/Mac
run_test.bat                # Windows

FULL achieves 80.1% accuracy with all optimizations:

Advanced MCA (NER + date parsing)
BM25 lexical search
Cross-encoder reranking
Query expansion
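
BM25 lexical search is one of the listed optimizations. As a rough illustration of what that scoring does, here is a minimal Okapi BM25 sketch in plain Python. It is not the compiled FULL implementation. The function names, the tokenizer, and the parameter defaults `k1=1.5` and `b=0.75` are illustrative assumptions.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercased word tokens of a string."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(query: str, docs: List[str],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc_tokens in tokenized:
        doc_freq.update(set(doc_tokens))

    scores = []
    for doc_tokens in tokenized:
        term_freq = Counter(doc_tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5) + 1)
            # Term frequency saturates via k1 and is length-normalized via b.
            tf = term_freq[term]
            norm = tf + k1 * (1 - b + b * len(doc_tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Alice adopted a dog named Rex",
    "Bob likes hiking in the Alps",
    "Alice moved to Berlin",
]
scores = bm25_scores("dog Rex", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(best, scores)
```

Lexical scoring of this kind complements embedding search because exact names and dates, which matter for memory questions, either match or do not.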

🎯 The Secret Sauce

MCA-First Gate 🛡️ - Proprietary entity/date protection algorithm
Hybrid Retrieval 🔄 - FAISS (BGE-large) + BM25 combined
Cross-Encoder ⚖️ - BAAI/bge-reranker-v2-m3 for surgical precision
Deterministic 🎲 - Temperature 0, reproducible every time

📊 Performance Metrics

Metric	Value	Description
⚡ Speed	2.5 seconds per question	Processing time per question
💰 Cost	<$0.10 per million tokens	Cost per million tokens processed
🎯 Recall	94-100%	Ground truth coverage
🔒 Isolation	100%	Complete conversation separation
🧪 Reproducible	100%	Every result verifiable
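
The recall figure is described as ground truth coverage. One plausible reading, assumed here rather than taken from the repository, is the share of questions for which at least one gold evidence item appears among the retrieved candidates. The sketch below computes that quantity on made-up data. The function name and the data layout are hypothetical.

```python
from typing import Dict, List, Set


def coverage_recall(retrieved: Dict[str, List[str]],
                    evidence: Dict[str, Set[str]]) -> float:
    """Fraction of questions with at least one gold evidence ID retrieved."""
    covered = sum(
        1 for question_id, gold in evidence.items()
        if gold & set(retrieved.get(question_id, []))
    )
    return covered / len(evidence)


# Made-up example: gold evidence IDs and retrieved candidate IDs per question.
evidence = {
    "q1": {"m1"},
    "q2": {"m2", "m5"},
    "q3": {"m9"},
    "q4": {"m4"},
}
retrieved = {
    "q1": ["m1", "m3"],
    "q2": ["m5"],
    "q3": ["m2"],       # gold evidence m9 was missed
    "q4": ["m4"],
}

recall = coverage_recall(retrieved, evidence)
print(f"coverage recall: {recall:.0%}")
```

A stricter definition would require every gold evidence item to be retrieved, which means replacing the intersection test with `gold <= set(...)`.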

🚀 Quick Start (30 seconds)

Prerequisites

# 1. Install Python 3.10+
# 2. CUDA-capable GPU (8GB+ VRAM)
# 3. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:14b
Run the System

🐧 Linux

git clone https://github.com/vac-architector/VAC-Memory-System.git
cd VAC-Memory-System
export OPENAI_API_KEY="sk-..."
./run_test.sh

🪟 Windows

git clone https://github.com/vac-architector/VAC-Memory-System.git
cd VAC-Memory-System
set OPENAI_API_KEY=sk-...
run_test.bat
Verify Results

# Run the official judge
python3 Core/gpt_official_generous_judge_from_mem0.py results/vac_v1_*.json

# Check accuracy
cat results/*_generous_judged.json | grep "accuracy"
📁 Repository Structure

VAC-Memory-System/
├── 🧠 Core/                    # Compiled pipeline (.so) + judge


