Is OpenFable More Efficient Than Traditional Retrieval Engines? (With a Walkthrough of the FABLE Algorithm)

An open-source retrieval engine implementing FABLE (Forest-Based Adaptive Bi-Path LLM-Enhanced Retrieval). OpenFable accepts documents as raw text, builds LLM-enhanced semantic forest indexes, and retrieves relevant content through bi-path retrieval with adaptive budget control.

Quick Start

export OPENAI_API_KEY=sk-...
docker compose up -d

Connect any MCP client to http://localhost:8000/v1/mcp/sse — Claude Desktop, Cursor, or your own agent:

pip install mcp-use langchain-openai

import asyncio

from langchain_openai import ChatOpenAI
from mcp_use import MCPAgent, MCPClient


async def main():
    # Point the MCP client at the SSE endpoint exposed by the running container.
    client = MCPClient.from_dict({
        "mcpServers": {"openfable": {"url": "http://localhost:8000/v1/mcp/sse"}}
    })
    agent = MCPAgent(llm=ChatOpenAI(), client=client, max_steps=10)
    # Ingest one document through the agent, then query the index.
    print(await agent.run("Ingest this document: The Eye of Kurak was discovered by "
                          "archaeologist Lena Voss in 1923 beneath the ruins of Kurak."))
    print(await agent.run("Search the indexed documents: Who discovered the Eye of Kurak?"))


asyncio.run(main())

A REST API is also available — see API Reference or the OpenAPI docs at http://localhost:8000/docs.

Why FABLE?

Most RAG systems chunk documents into flat segments and retrieve by vector similarity. This works for simple queries but breaks down when:

A question spans multiple sections of a document
The answer requires understanding how sections relate to each other
You need to control how many tokens you send to the LLM
Relevant content is buried in a subsection that doesn't match the query's surface-level keywords

How FABLE Compares to Other Approaches


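	Fixed-size chunking	Semantic chunking	RAPTOR	FABLE
Chunk boundaries	Token count	Embedding similarity	Token count	LLM-identified discourse breakpoints
Index structure	Flat	Flat	Bottom-up tree (clustering)	Top-down tree (LLM-generated hierarchy)
Retrieval	Vector only	Vector only	Vector retrieval across tree layers	Bi-path: LLM reasoning + vector retrieval with tree propagation
Budget control	None	None	None	Token budget with adaptive document/node routing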

FABLE solves this by building a semantic forest -- a tree structure where each document becomes a hierarchy of nodes (root, sections, subsections, leaves). Retrieval then uses two complementary paths at each level:

LLM-guided path -- an LLM reasons about which documents and subtrees are relevant based on their summaries and table-of-contents structure
Vector path -- embedding similarity search over the same tree nodes, with structure-aware score propagation (TreeExpansion)

Results from both paths are fused, deduplicated, and trimmed to fit within a token budget you specify.

How It Works

Document Ingestion

When you POST a document, OpenFable:

Semantic chunking -- an LLM identifies discourse boundaries and splits the text into coherent chunks (not fixed-size windows)
Tree construction -- chunks are organized into a hierarchical tree. The LLM generates summaries for internal nodes, creating a table-of-contents-like structure
Multi-granularity embedding -- every node (root, section, subsection, leaf) gets a BGE-M3 embedding. Internal nodes embed their toc_path + summary; leaves embed their raw content
Indexing -- embeddings are stored in pgvector with HNSW indexes for fast similarity search
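
The multi-granularity embedding step can be pictured with a small sketch. The `Node` class, the `embedding_text` helper, and the newline used to join toc_path and summary are assumptions made for illustration, not OpenFable's actual data model; only the rule itself (internal nodes embed toc_path + summary, leaves embed raw content) comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a document tree (hypothetical shape, for illustration only)."""
    toc_path: str                 # e.g. "Kurak > Discovery"
    summary: str = ""             # LLM-written summary, used by internal nodes
    content: str = ""             # raw chunk text, used by leaves
    children: list["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def embedding_text(node: Node) -> str:
    """Text that would be handed to the embedding model for this node."""
    if node.is_leaf:
        return node.content
    # Assumed join format: toc_path and summary separated by a newline.
    return f"{node.toc_path}\n{node.summary}"


def walk(node: Node):
    """Yield every node in the tree, parents before children."""
    yield node
    for child in node.children:
        yield from walk(child)


leaf = Node(toc_path="Kurak > Discovery", content="Lena Voss found the Eye in 1923.")
root = Node(toc_path="Kurak", summary="History of the Eye of Kurak.", children=[leaf])
texts = [embedding_text(n) for n in walk(root)]
```

Each string in `texts` would then be embedded and stored alongside its node, so a query can match a section summary as well as a leaf chunk.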

Retrieval

When you POST a query with a token_budget:

Document level -- which documents matter?

LLMselect: the LLM sees shallow tree nodes (toc paths + summaries) and scores document relevance
Vector top-K: cosine similarity search over internal node embeddings, aggregated to document level
Results are fused (union, max-score)

Budget routing -- if the fused documents fit within your token budget, their full content is returned. If not, retrieval drills down to node level.
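
A minimal sketch of the document-level fusion and budget routing described above. The function names, the assumption that both paths produce scores on a comparable scale, and the precomputed per-document token counts are all hypothetical; the source specifies only "union, max-score" and the fits-or-drill-down rule.

```python
def fuse_max(llm_scores: dict[str, float], vec_scores: dict[str, float]) -> dict[str, float]:
    """Union of both result sets; a document found by both keeps its higher score."""
    fused = dict(llm_scores)
    for doc_id, score in vec_scores.items():
        fused[doc_id] = max(score, fused.get(doc_id, score))
    return fused


def route(fused: dict[str, float], doc_tokens: dict[str, int], token_budget: int) -> str:
    """Return 'documents' if the fused set fits the budget, otherwise 'nodes'."""
    total = sum(doc_tokens[doc_id] for doc_id in fused)
    return "documents" if total <= token_budget else "nodes"


# Hypothetical scores from the two paths and hypothetical document sizes.
fused = fuse_max({"a": 0.9, "b": 0.4}, {"b": 0.7, "c": 0.5})
doc_tokens = {"a": 1200, "b": 800, "c": 3000}
large_budget = route(fused, doc_tokens, token_budget=6000)
small_budget = route(fused, doc_tokens, token_budget=4000)
```

Here the three fused documents total 5000 tokens, so the larger budget returns whole documents while the smaller one falls through to node-level retrieval.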

Node level -- which chunks matter?

LLMnavigate: the LLM sees the full tree hierarchy and selects relevant subtree roots
TreeExpansion: structure-aware scoring using S(v) = 1/3[S_sim + S_inh + S_child] -- similarity with depth decay, ancestor inheritance, and child aggregation propagate relevance through tree edges
Results are fused with LLM-guided nodes getting priority, then greedily selected up to the token budget
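
The TreeExpansion formula can be sketched as follows. The source gives only S(v) = 1/3[S_sim + S_inh + S_child] and names the three terms, so the concrete definitions here are assumptions: S_sim is cosine similarity multiplied by `decay ** depth`, S_inh is the best S_sim among a node's ancestors, and S_child is the mean S_sim of its direct children. The `Node` class and the `decay` parameter are likewise hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str   # must be unique within the tree in this sketch
    sim: float  # cosine similarity between the query and this node, precomputed
    children: list["Node"] = field(default_factory=list)


def tree_expansion(root: Node, decay: float = 0.9) -> dict[str, float]:
    """Score every node with S(v) = (S_sim + S_inh + S_child) / 3."""
    scores: dict[str, float] = {}

    def s_sim(node: Node, depth: int) -> float:
        # Assumed form of "similarity with depth decay".
        return node.sim * decay ** depth

    def visit(node: Node, depth: int, best_ancestor: float) -> None:
        own = s_sim(node, depth)
        kids = [s_sim(child, depth + 1) for child in node.children]
        s_child = sum(kids) / len(kids) if kids else 0.0  # assumed: mean over children
        s_inh = best_ancestor                              # assumed: best ancestor S_sim
        scores[node.name] = (own + s_inh + s_child) / 3
        for child in node.children:
            visit(child, depth + 1, max(best_ancestor, own))

    visit(root, 0, 0.0)
    return scores


leaf = Node("leaf", sim=1.0)
section = Node("section", sim=0.5, children=[leaf])
root = Node("root", sim=0.4, children=[section])
scores = tree_expansion(root, decay=0.5)
```

Under these assumptions a node's score rises when its ancestors or children match the query, even if its own embedding is only a moderate match, which is the point of propagating relevance along tree edges.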

The result: you get the most relevant chunks, in document order, within your token budget -- using both LLM reasoning and structural context, not just embedding distance.
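
The final selection step might look like the sketch below. The `Chunk` fields, the tie-breaking rule, and the choice to skip an oversized chunk and keep trying smaller ones are assumptions; the source states only that LLM-guided nodes get priority, selection is greedy up to the token budget, and results come back in document order. For brevity the sketch handles a single document, so `position` alone defines the order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    position: int      # order of the chunk within its document
    tokens: int
    score: float
    llm_picked: bool   # True if the LLM-guided path selected it


def select_within_budget(chunks: list[Chunk], token_budget: int) -> list[Chunk]:
    """Greedily keep the best chunks that fit, then restore document order."""
    # De-duplicate by id, preferring the LLM-picked copy, then the higher score.
    best: dict[str, Chunk] = {}
    for c in chunks:
        kept = best.get(c.chunk_id)
        if kept is None or (c.llm_picked, c.score) > (kept.llm_picked, kept.score):
            best[c.chunk_id] = c
    # LLM-guided chunks first, then by descending score.
    ranked = sorted(best.values(), key=lambda c: (not c.llm_picked, -c.score))
    chosen: list[Chunk] = []
    used = 0
    for c in ranked:
        if used + c.tokens <= token_budget:
            chosen.append(c)
            used += c.tokens
    return sorted(chosen, key=lambda c: c.position)


candidates = [
    Chunk("c1", 0, 300, 0.50, False),
    Chunk("c2", 1, 400, 0.95, False),
    Chunk("c3", 2, 500, 0.40, True),
    Chunk("c2", 1, 400, 0.60, True),  # same chunk, also found by the LLM path
]
tight = [c.chunk_id for c in select_within_budget(candidates, token_budget=1000)]
loose = [c.chunk_id for c in select_within_budget(candidates, token_budget=1200)]
```

With the tighter budget the two LLM-picked chunks fill 900 tokens and the vector-only chunk no longer fits; with the looser budget all three are returned in their original order.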

Architecture

flowchart LR
client([Developer / RAG App])
api["OpenFable API<br/>FastAPI + Python 3.12"]
db["PostgreSQL 17<br/>+ pgvector"]
embeddings["Embeddings<br/>TEI / OpenAI"]
llm["LLM Provider<br/>Anthropic / OpenAI / Ollama"]

client -- "REST /v1/api" --> api
client -- "MCP /v1/mcp" --> api
api -- "SQLAlchemy" --> db
api -- "/v1/embeddings" --> embeddings
api -- "LiteLLM" --> llm

Configuration

All settings are controlled by environment variables (no .env file).

Set your LLM provider's API key directly — OpenFable uses LiteLLM and reads the standard provider variables:


Provider	Environment variable	Example model
OpenAI	`OPENAI_API_KEY`	`gpt-5.4` (default)
Anthropic	`ANTHROPIC_API_KEY`	`anthropic/claude-sonnet-4-5-20250514`
Ollama	(none; set `OPENFABLE_LITELLM_BASE_URL`)	`ollama/qwen3:8b`

OpenFable-specific settings use the OPENFABLE_ prefix:

Variable	Default	Description
`OPENFABLE_DATABASE_URL`	`postgresql://openfable:openfable@db:5432/openfable`	PostgreSQL connection string
`OPENFABLE_LITELLM_MODEL`	`gpt-5.4`	L…
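
To check which values a process would pick up, the two documented variables and their defaults can be resolved with a few lines of Python. This is an illustrative helper only, not OpenFable's own settings loader; `DEFAULTS` and `load_settings` are hypothetical names, and the defaults are copied from the table above.

```python
import os
from collections.abc import Mapping

# Defaults as listed in the settings table; other variables are not covered here.
DEFAULTS = {
    "OPENFABLE_DATABASE_URL": "postgresql://openfable:openfable@db:5432/openfable",
    "OPENFABLE_LITELLM_MODEL": "gpt-5.4",
}


def load_settings(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the documented settings, with environment values overriding defaults."""
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}


# Example: switching the model to a local Ollama model via the environment.
overridden = load_settings({"OPENFABLE_LITELLM_MODEL": "ollama/qwen3:8b"})
```

Passing a plain dictionary instead of `os.environ` makes the override behaviour easy to inspect without touching the real environment.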

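Frequently Asked Questions (FAQ)

What advantages does OpenFable have over other retrieval engines?

OpenFable is an open-source implementation of FABLE. It builds LLM-enhanced semantic forest indexes and retrieves through bi-path retrieval with adaptive budget control, an approach intended to retrieve relevant content more precisely than flat chunk-and-vector methods.

How do I get started with OpenFable quickly?

Set OPENAI_API_KEY, run docker compose up -d, then connect an MCP client to http://localhost:8000/v1/mcp/sse. Clients such as Claude Desktop and Cursor are supported.

How does OpenFable's retrieval process work?

OpenFable takes raw text and builds a semantic forest index from it. At query time it combines bi-path retrieval with adaptive budget control to return the most relevant content, covering the full flow from document ingestion to query retrieval.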
