How to Reduce AI Agent Token Costs: The OpenViking Filesystem Paradigm Explained

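Q: How does OpenViking help reduce the token cost of AI Agents?

OpenViking uses a three-level context loading mechanism: content is automatically processed into a summary, an overview, and full details, and each level is loaded only when needed. In the LoCoMo10 benchmark reported in this article, this cut input tokens by about 92% compared with a LanceDB vector-database setup.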

Farewell to the fragmented storage of vector databases: Redefining Agent memory management with a filesystem paradigm.

I. Why OpenViking?

When building AI Agents, developers often face these pain points:

Context fragmentation: memories are scattered through code, resources live in vector databases, and skills are stored haphazardly.
Context explosion: long-running tasks generate massive amounts of information, and simple truncation loses critical content.
Poor retrieval: traditional RAG uses flat storage and lacks a global view.
Unobservable black box: the retrieval pipeline is opaque, so errors are hard to locate.
Memory that cannot iterate: only conversations are recorded, and task experience is never accumulated.

OpenViking is a context database designed specifically for AI Agents. It uses a "filesystem paradigm" to manage memories, resources, and skills in one place, so the Agent gets smarter the more it is used.

II. How It Works: Five Core Innovations

1. Filesystem Paradigm

OpenViking abandons the flat model of traditional vector databases and maps all context into a virtual filesystem under the viking:// protocol:

viking://
├── resources/          # Project documents, code repositories
│   └── my_project/
├── user/               # User memories
│   └── memories/
│       ├── preferences/    # User preferences
│       ├── entities/       # People/projects the user follows
│       └── events/         # Important events
└── agent/              # Agent memories
    ├── skills/             # Skill library
    ├── memories/
    │   ├── cases/          # Task cases
    │   └── patterns/       # Reusable patterns
    └── instructions/       # Instruction sets

Advantage: Agents can use deterministic operations such as ls, find, and read to locate information precisely instead of relying on fuzzy matching.
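To make the idea of deterministic lookups concrete, here is a minimal sketch over an in-memory stand-in for the tree. It is not OpenViking's API: the `FILES` entries, and the helper names `ls`, `glob_files`, and `read`, are hypothetical and exist only for illustration.

```python
from fnmatch import fnmatch

# Toy in-memory stand-in for a viking:// tree.
# Keys are full URIs, values are file contents. All entries are made up.
FILES = {
    "viking://resources/my_project/README.md": "Project overview",
    "viking://user/memories/preferences/editor.md": "Prefers dark mode",
    "viking://agent/skills/deploy.md": "How to deploy the service",
    "viking://agent/memories/cases/fix_timeout.md": "Raised the retry limit",
}


def _prefix(directory: str) -> str:
    """Normalize a directory URI so that it ends with exactly one slash."""
    return directory if directory.endswith("/") else directory + "/"


def ls(directory: str) -> list[str]:
    """Return the immediate children of a directory URI, sorted."""
    prefix = _prefix(directory)
    children = set()
    for uri in FILES:
        if uri.startswith(prefix):
            parts = uri[len(prefix):].split("/", 1)
            # A child followed by more path segments is a subdirectory.
            children.add(parts[0] + ("/" if len(parts) > 1 else ""))
    return sorted(children)


def glob_files(directory: str, pattern: str) -> list[str]:
    """Return URIs under a directory whose file name matches a glob pattern."""
    prefix = _prefix(directory)
    return sorted(
        uri for uri in FILES
        if uri.startswith(prefix) and fnmatch(uri.rsplit("/", 1)[-1], pattern)
    )


def read(uri: str) -> str:
    """Return the content stored at an exact URI."""
    return FILES[uri]
```

Every call here returns the same result for the same input, which is the property the filesystem paradigm relies on: the Agent navigates by path rather than by similarity score.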

2. Three-Level Context Loading

Content is automatically processed into a three-level structure and loaded on demand, which significantly reduces token consumption:

Level	Name	Size	Purpose
L0	Summary	~100 tokens	Quick filtering
L1	Overview	~2k tokens	Decision reference
L2	Details	Full content	In-depth reading

Result: in the LoCoMo10 benchmark reported later in this article, input tokens dropped by about 92% compared with the LanceDB setup (4,264,396 vs. 51,574,530).
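The saving comes from sending the short level by default and upgrading only the entries that matter. The sketch below illustrates that selection logic under stated assumptions: the `Entry` class, the `build_context` function, the whitespace-based token counter, and the sample sizes are all hypothetical and much smaller than the ~100 / ~2k token levels in the table.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    uri: str
    l0: str  # short summary
    l1: str  # overview
    l2: str  # full content


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def build_context(entries: list[Entry], relevant: set[str], deep: set[str]) -> str:
    """Include L0 by default; use L1 for relevant URIs and L2 for deep reads."""
    parts = []
    for e in entries:
        if e.uri in deep:
            parts.append(e.l2)
        elif e.uri in relevant:
            parts.append(e.l1)
        else:
            parts.append(e.l0)
    return "\n".join(parts)


# Made-up entries: 2-word summaries, 10-word overviews, 150-word full texts.
ENTRIES = [
    Entry(f"viking://resources/{name}", f"{name} summary",
          f"{name} overview " * 5, f"{name} full text " * 50)
    for name in ("a", "b", "c")
]

tiered = build_context(ENTRIES,
                       relevant={"viking://resources/b"},
                       deep={"viking://resources/c"})
full = build_context(ENTRIES, relevant=set(), deep={e.uri for e in ENTRIES})
print(count_tokens(tiered), "tokens tiered vs", count_tokens(full), "tokens full")
```

In this toy setup only one of three entries is read in full, so the tiered context is roughly a third of the size of loading everything; the real ratio depends on how many entries a task actually needs at L2.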

3. Directory Recursive Retrieval

A five-step retrieval strategy that understands the context of information the way a human expert would:

Intent analysis: generates multi-dimensional queries from the conversation.
Initial positioning: vector retrieval locks onto the most relevant directories.
Refined exploration: a second retrieval pass runs inside each of those directories.
Recursive descent: subdirectories are explored layer by layer.
Result aggregation: the most relevant context is returned.
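The last four steps above can be sketched as a small recursive function. This is an illustration only, not OpenViking's implementation: intent analysis is skipped because it needs an LLM, a word-overlap score stands in for vector similarity, and the `TREE` contents and the `top_dirs` parameter are made up.

```python
def score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words found in the text.
    Stands in for vector similarity."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


# Directories are dicts with a "summary" and "children"; files are plain strings.
TREE = {
    "summary": "root",
    "children": {
        "resources": {
            "summary": "project docs deployment guides",
            "children": {
                "deploy.md": "deployment steps for the staging cluster",
                "style.md": "code style rules",
            },
        },
        "user": {
            "summary": "user preferences and events",
            "children": {
                "prefs.md": "user prefers short answers",
            },
        },
    },
}


def retrieve(query, node, path="viking://", top_dirs=1, results=None):
    """Score children, collect matching files, recurse into the best directories."""
    if results is None:
        results = []
    dirs = []
    for name, child in node["children"].items():
        if isinstance(child, dict):
            dirs.append((score(query, child["summary"]), name, child))
        else:
            s = score(query, child)
            if s > 0:
                results.append((s, path + name))
    # Descend only into the highest-scoring subdirectories.
    dirs.sort(key=lambda d: d[0], reverse=True)
    for s, name, child in dirs[:top_dirs]:
        if s > 0:
            retrieve(query, child, path + name + "/", top_dirs, results)
    # Aggregate: best-scoring files first.
    return sorted(results, reverse=True)


print(retrieve("deployment steps", TREE))
```

Because low-scoring directories are pruned before their contents are scored, the search touches far fewer items than a flat scan, and the path of each hit shows how it was reached.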

4. Automatic Session Management and Memory Extraction

Six types of memories are extracted automatically when a session ends:

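Type	Owner	Description	Mergeable
profile	user	User identity attributes	✅
preferences	user	User preferences	✅
entities	user	People/projects the user follows	✅
events	user	Events/decisions	❌
cases	agent	Problem + solution	❌
patterns	agent	Reusable patterns	✅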

Process: messages → LLM extraction → vector deduplication → LLM decision → write to storage
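The deduplication and decision steps of this pipeline can be sketched as follows. This is an assumption-heavy illustration: the candidate memories are taken as already extracted, Jaccard word overlap stands in for vector similarity, a fixed rule replaces the LLM decision, and the `decide` function, its `threshold`, and the create/merge/skip outcomes are hypothetical. Only the mergeable set comes from the table above.

```python
MERGEABLE = {"profile", "preferences", "entities", "patterns"}  # from the table


def similarity(a: str, b: str) -> float:
    """Jaccard word overlap; a toy stand-in for vector similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def decide(kind: str, text: str, store: list[dict], threshold: float = 0.5) -> str:
    """Apply one candidate memory to the store and return what was done:
    'create', 'merge', or 'skip'."""
    for item in store:
        if item["kind"] != kind:
            continue
        sim = similarity(item["text"], text)
        if sim == 1.0:
            return "skip"  # same wording is already stored
        if sim >= threshold and kind in MERGEABLE:
            item["text"] += "; " + text  # fold into the existing memory
            return "merge"
    # Non-mergeable types (events, cases) always become separate entries.
    store.append({"kind": kind, "text": text})
    return "create"


store: list[dict] = []
for kind, text in [
    ("preferences", "user prefers dark mode"),
    ("preferences", "user prefers dark mode editor"),
    ("events", "deployed v2 on friday"),
    ("events", "deployed v2 on monday"),
]:
    print(kind, "->", decide(kind, text, store))
```

The two similar preferences collapse into one entry, while the two similar events stay separate, which mirrors the mergeable column: preferences describe a current state, events are individual records.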

5. Viking URI System

A unified resource identifier that supports multi-tenant isolation:

viking://{scope}/{path}

resources: independent resources, valid long-term.
user: user data, valid long-term.
agent: Agent data, valid long-term.
session: session data, valid for the lifetime of the session.
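Since the format is a regular URI, it can be split with Python's standard `urllib.parse`. The helper below is a minimal sketch, not part of OpenViking; the function name `parse_viking_uri` and the strict scope check are assumptions made for illustration.

```python
from urllib.parse import urlparse

SCOPES = {"resources", "user", "agent", "session"}  # the four scopes listed above


def parse_viking_uri(uri: str) -> tuple[str, str]:
    """Split viking://{scope}/{path} into (scope, path) and validate the scope."""
    parsed = urlparse(uri)
    if parsed.scheme != "viking":
        raise ValueError(f"not a viking:// URI: {uri}")
    if parsed.netloc not in SCOPES:
        raise ValueError(f"unknown scope: {parsed.netloc!r}")
    # urlparse puts the scope in netloc and the rest in path (with a leading slash).
    return parsed.netloc, parsed.path.lstrip("/")


print(parse_viking_uri("viking://user/memories/preferences"))
```

Keeping the scope as the first URI component means lifetime and ownership rules can be decided from the identifier alone, before any storage lookup.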

III. Main Features

Resource Management

add-resource
- Import local files or URLs.
export/import
- Export/import .ovpack.
ls/tree/read
- Filesystem operations.

Semantic Retrieval

find
- Semantic retrieval.
search
- Context-aware retrieval.
grep/glob
- Pattern matching.

Session Management

session new/list/get/delete
- Session lifecycle management.
session add-message/commit
- Message submission and memory extraction.

System Observability

system status/health
- Health checks.
observer queue/vikingdb/vlm
- Component status monitoring.

IV. Use Cases

1. Personal AI Assistant

Maintains user preferences and habits over the long term, keeping behavior consistent across sessions.

2. Development Team Collaboration

A shared project knowledge base lets new members get up to speed quickly and keeps accumulated knowledge from being lost.

3. Enterprise Knowledge Management

Multi-tenant isolation keeps each team's data independent, with support for permission control.

4. Autonomous Agent Systems

Task experience accumulates automatically, so Agent capabilities keep evolving.

V. Using OpenViking with OpenClaw

OpenClaw is a popular AI Agent framework, and OpenViking provides it with a long-term memory backend.

Quick Start

The setup helper automatically checks the environment, creates the configuration, and deploys the plugin.

cd /path/to/OpenViking
npx ./examples/openclaw-memory-plugin/setup-helper
openclaw gateway

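Core Features

Feature	Description
autoCapture	Automatically extracts memories from conversations
autoRecall	Automatically injects relevant memories into the context
Memory deduplication	Automatic deduplication based on summary/URI
Smart ranking	Preference boost, recency boost, lexical overlap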

Performance Gains

Based on the LoCoMo10 long-dialogue dataset (1,540 cases):

Setup	Task completion rate	Input tokens
OpenClaw (native)	35.65%	24,611,530
OpenClaw + LanceDB	44.55%	51,574,530
OpenClaw + OpenViking	52.08%	4,264,396

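Improvement (computed from the table above):

Compared to native OpenClaw: +46% relative completion rate (35.65% → 52.08%, or +16.43 points) and about 83% fewer input tokens.
Compared to OpenClaw + LanceDB: +17% relative completion rate (44.55% → 52.08%, or +7.53 points) and about 92% fewer input tokens.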
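The percentages can be recomputed directly from the benchmark table. The short sketch below does that arithmetic; the dictionary keys and the `compare` helper are names chosen here for illustration, and the figures are copied from the table.

```python
# (task completion rate in %, input tokens), copied from the benchmark table.
runs = {
    "native": (35.65, 24_611_530),
    "lancedb": (44.55, 51_574_530),
    "openviking": (52.08, 4_264_396),
}


def compare(baseline: str, candidate: str = "openviking") -> tuple[float, float]:
    """Return (relative completion gain in %, input-token reduction in %)."""
    base_rate, base_tokens = runs[baseline]
    rate, tokens = runs[candidate]
    gain = (rate / base_rate - 1) * 100
    reduction = (1 - tokens / base_tokens) * 100
    return round(gain, 1), round(reduction, 1)


for baseline in ("native", "lancedb"):
    gain, reduction = compare(baseline)
    print(f"vs {baseline}: +{gain}% completion, -{reduction}% input tokens")
```

With the table's figures this gives roughly a 46% relative gain and 83% fewer input tokens against native OpenClaw, and roughly a 17% gain and 92% fewer input tokens against the LanceDB setup. Note that the completion gains are relative; in absolute terms they are 16.43 and 7.53 percentage points.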

Configuration Example

{
  "vlm": {
    "backend": "volcengine",
    "api_key": "<your-key>",
    "model": "doubao-seed-1-8-251228",
    "api_base": "https://ark.cn-beijing.volces.com/api/v3"
  },
  "embedding": {
    "dense": {
      "backend": "volcengine",
      "api_key": "<your-key>",
      "model": "doubao-embedding-vision-250615",
      "dimension": 1024
    }
  }
}

VI. Comparison with PageIndex

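Dimension	OpenViking	PageIndex
Positioning	General-purpose context database	Specialized document retrieval system
Core innovation	Filesystem + tiered storage + session memory	Vectorless + LLM reasoning-based retrieval
Storage model	`viking://` virtual filesystem	Hierarchical tree of sections
Retrieval method	Vector + directory recursion + intent analysis	Pure LLM reasoning
Memory system	6 types, automatically extracted and deduplicated	None
Infrastructure	Requires a vector database	No vector database needed
Token optimization	L0/L1/L2 three-level loading (about -92% input tokens vs. LanceDB)	None
Best fit	Long-term Agent memory, multiple resource types	Specialized document QA (financial reports, legal)
Reported accuracy (different benchmarks)	52.08% task completion rate (LoCoMo10)	98.7% on FinanceBench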

How to Choose?

Choose OpenViking if:

You are building long-running AI Agents.
You need to manage several kinds of context (documents + code + skills + memories).
You need session memories to iterate automatically.
You need production-grade deployment (multi-tenancy, permission control).

Choose PageIndex if:

You mainly do question answering over specialized documents.
You have extremely high accuracy requirements.
You do not want to maintain a vector database.
Your documents have a clear table-of-contents structure.

VII. Quick Start

1. Installation

pip install openviking
# CLI tool (optional)
curl -fsSL https://raw.githubusercontent.com/volcengine/OpenViking/main/crates/ov_cli/install.sh | bash

2. Configuration

Create ~/.openviking/ov.conf:

{
  "storage": {
    "workspace": "/path/to/workspace"
  },
  "embedding": {
    "dense": {
      "provider": "volcengine",
      "api_key": "<key>",
      "model": "doubao-embedding-vision-250615",
      "dimension": 1024
    }
  },
  "vlm": {
    "provider": "volcengine",
    "api_key": "<key>",
    "model": "doubao-seed-1-8-251228"
  }
}
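Before starting the server, it can help to confirm that the file parses as JSON and contains the keys used in the example. The sketch below does that; it is not an OpenViking tool, and treating every key from the example as required is an assumption of this sketch rather than a documented rule. The names `EXPECTED`, `missing_keys`, and `check_file` are hypothetical.

```python
import json

# Dotted paths taken from the example config; treating them all as required
# is an assumption of this sketch, not a documented rule.
EXPECTED = [
    "storage.workspace",
    "embedding.dense.provider",
    "embedding.dense.api_key",
    "embedding.dense.model",
    "embedding.dense.dimension",
    "vlm.provider",
    "vlm.api_key",
    "vlm.model",
]


def missing_keys(conf: dict) -> list[str]:
    """Return the expected dotted paths that are absent from the config."""
    missing = []
    for dotted in EXPECTED:
        node = conf
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing


def check_file(path: str) -> list[str]:
    """Load a JSON config file and report missing keys.
    Raises json.JSONDecodeError if the file is not valid JSON."""
    with open(path, encoding="utf-8") as fh:
        return missing_keys(json.load(fh))
```

A call such as `check_file("/home/me/.openviking/ov.conf")` (path is illustrative) returns an empty list when every key from the example is present.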

3. Start and Use

# Start the server
openviking-server
# Add a resource
ov add-resource https://github.com/volcengine/OpenViking --wait
# Semantic retrieval
ov find "what is openviking"
# View the file tree
ov tree viking://resources/OpenViking -L 2

VIII. Summary

OpenViking is not just a tool, but a new paradigm for AI Agent context management:

Memories, resources, and skills are managed in one place with a filesystem mindset.
The three-level loading mechanism significantly reduces token costs.
Directory recursive retrieval improves context understanding.
Automatic memory extraction makes the Agent smarter the more it is used.

If you are building an AI Agent, give OpenViking a try and take the pain out of context management.

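Frequently Asked Questions (FAQ)

How does OpenViking help reduce the token cost of AI Agents?

Through three-level context loading: content is automatically processed into L0 (summary, ~100 tokens), L1 (overview, ~2k tokens), and L2 (details, full content), and each level is loaded only when needed. In the LoCoMo10 benchmark reported in this article, this cut input tokens by about 92% compared with a LanceDB vector-database setup.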

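What exactly is OpenViking's filesystem paradigm?

It uses a filesystem-like virtual directory structure (under the viking:// protocol) to manage memories, resources, and skills in one place. Agents can use deterministic operations such as ls and find to locate information precisely instead of relying on fuzzy matching.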

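How does OpenViking manage memory after a session ends?

When a session ends, six types of memories (such as user preferences and task cases) are extracted automatically. They go through LLM extraction, vector deduplication, and an LLM decision step before being written to storage, so memories accumulate iteratively and the Agent gets smarter the more it is used.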