静态RAG和动态RAG哪个更适合我的项目?(附技术对比与代码实践)
Static RAG vs. Dynamic RAG: Core Principles, Technical Comparison, and Practical Guide
本文系统介绍静态 RAG 与动态 RAG 的核心原理、技术对比、主流实现方案及代码实践,适合技术选型和深入学习参考。
This article systematically introduces the core principles, technical comparisons, mainstream implementation approaches, and code practices of Static RAG and Dynamic RAG, suitable for technical selection and in-depth learning reference.
目录
- 一、RAG 技术概述
- 二、静态 RAG
- 2.1 核心原理
- 2.2 优化技术
- 2.3 主流实践方案
- 2.4 代码示例
- 三、动态 RAG
- 3.1 核心原理
- 3.2 主流实现方案
- 四、Self-RAG 详解
- 4.1 核心原理
- 4.2 反思令牌机制
- 4.3 环境配置
- 4.4 代码实现
- 五、CRAG 详解
- 5.1 核心原理
- 5.2 环境配置
- 5.3 完整代码实现(LangGraph)
- 六、RAGFlow 平台
- 6.1 平台定位
- 6.2 Agent 工作流机制
- 6.3 SDK 使用
- 七、技术对比与选型建议
- 7.1 静态 vs 动态 RAG
- 7.2 Self-RAG vs CRAG
- 7.3 选型建议
- 八、参考资源
Table of Contents
- I. Overview of RAG Technology
- II. Static RAG
- 2.1 Core Principles
- 2.2 Optimization Techniques
- 2.3 Mainstream Implementation Approaches
- 2.4 Code Examples
- III. Dynamic RAG
- 3.1 Core Principles
- 3.2 Mainstream Implementation Approaches
- IV. Self-RAG Deep Dive
- 4.1 Core Principles
- 4.2 Reflection Token Mechanism
- 4.3 Environment Setup
- 4.4 Code Implementation
- V. CRAG Deep Dive
- 5.1 Core Principles
- 5.2 Environment Setup
- 5.3 Complete Code Implementation (LangGraph)
- VI. RAGFlow Platform
- 6.1 Platform Positioning
- 6.2 Agent Workflow Mechanism
- 6.3 SDK Usage
- VII. Technical Comparison and Selection Advice
- 7.1 Static vs. Dynamic RAG
- 7.2 Self-RAG vs. CRAG
- 7.3 Selection Advice
- VIII. Reference Resources
一、RAG 技术概述
RAG(Retrieval-Augmented Generation,检索增强生成)是一种结合信息检索与文本生成的技术架构,通过从外部知识库检索相关信息来增强大语言模型的生成能力。
RAG (Retrieval-Augmented Generation) is a technical architecture that combines information retrieval with text generation, enhancing the generative capabilities of large language models by retrieving relevant information from external knowledge bases.
为什么需要 RAG?
Why Do We Need RAG?
| 挑战 | RAG 的解决方案 |
|---|---|
| LLM 知识截止日期 | 检索最新的外部知识 |
| 幻觉问题 | 基于检索到的事实生成 |
| 领域知识不足 | 接入专业知识库 |
| 私有数据访问 | 检索企业内部文档 |
| Challenge | RAG's Solution |
|---|---|
| LLM knowledge cutoff | Retrieve the latest external knowledge |
| Hallucination problem | Generate based on retrieved facts |
| Insufficient domain knowledge | Connect to professional knowledge bases |
| Private data access | Retrieve internal corporate documents |

RAG 基础流程
RAG Basic Workflow
```
用户问题 → 向量化 → 相似度检索 → 获取相关文档 → 构建 Prompt → LLM 生成 → 返回答案

User Query → Vectorization → Similarity Search → Retrieve Relevant Documents → Construct Prompt → LLM Generation → Return Answer
```
二、静态 RAG
II. Static RAG
2.1 核心原理
2.1 Core Principles
静态 RAG 是传统的检索增强生成方法,采用**「一次检索、一次生成」**的线性流程:
Static RAG is the traditional retrieval-augmented generation method, employing a linear workflow of "one-time retrieval, one-time generation":
```
静态 RAG 流程:
用户问题 → 向量化 Query → 检索 Top-K → 拼接 Prompt → 生成答案
特点:流程固定,检索一次,上下文不再更新

Static RAG Workflow:
User Query → Vectorize Query → Retrieve Top-K → Concatenate into Prompt → Generate Answer
Feature: fixed process, single retrieval, context is not updated
```

「核心特点:」
Core Characteristics:
- 预处理阶段:文档离线分块、向量化、存入向量数据库
- 检索一次性:用户查询时一次性检索相关文档
- 固定上下文:检索到的内容直接拼接到 prompt,不再更新
- 流程固定:查询 → 检索 → 生成,线性执行
- Preprocessing Phase: Document offline chunking, vectorization, storage in vector database.
- One-time Retrieval: Retrieve relevant documents once upon user query.
- Fixed Context: Retrieved content is directly concatenated into the prompt and not updated.
- Fixed Process: Query → Retrieve → Generate, executed linearly.
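The fixed linear flow described above can be sketched without any RAG framework. Everything in this sketch is a toy stand-in: `embed` uses term overlap instead of a real embedding model, and the final "generation" step simply returns the assembled prompt rather than calling an LLM. It exists only to make the one-retrieval, fixed-context shape visible.

```python
def embed(text: str) -> set[str]:
    """Toy 'embedding': a set of lowercased terms (stand-in for a vector)."""
    return set(text.lower().split())

def similarity(a: set[str], b: set[str]) -> float:
    """Jaccard overlap as a stand-in for cosine similarity."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Preprocessing phase: chunk and "index" documents once, offline.
chunks = [
    "RAG combines retrieval with generation.",
    "Static RAG retrieves once and never updates the context.",
    "Vector databases store embeddings for similarity search.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def static_rag(question: str, k: int = 2) -> str:
    # One-time retrieval: top-k chunks by similarity to the query.
    q = embed(question)
    ranked = sorted(index, key=lambda item: similarity(q, item[1]), reverse=True)
    context = "\n".join(chunk for chunk, _ in ranked[:k])
    # Fixed context: concatenated into a single prompt; a real system would
    # now send this prompt to an LLM exactly once and return its answer.
    return f"Context:\n{context}\n\nQuestion: {question}"

print(static_rag("How does static RAG update context?"))
```

Note that once the prompt is built, nothing feeds back into retrieval; that absence of a loop is what distinguishes static RAG from the dynamic variants discussed later.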
2.2 优化技术
2.2 Optimization Techniques
虽然是静态流程,但可以通过多种技术优化检索和生成质量:
Although it's a static process, retrieval and generation quality can be optimized through various techniques:
| 技术 | 原理 | 适用场景 |
|---|---|---|
| 「HyDE」 | 先让 LLM 生成假设答案,用假设答案去检索 | 问题表述模糊时 |
| 「Query Expansion」 | 扩展用户查询为多个变体,提升召回率 | 提高召回率 |
| 「Reranker」 | 检索后用交叉编码器重排序 | 提高精确度 |
| 「Sentence Window」 | 检索小块,返回时扩展上下文窗口 | 需要更多上下文 |
| 「Parent Document」 | 检索小块,返回其父文档 | 保持文档完整性 |
| 「Fusion RAG」 | 多路检索结果融合(RRF 算法) | 多维度召回 |
| 「Hybrid Search」 | 向量检索 + BM25 关键词检索混合 | 兼顾语义和关键词 |
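Among these techniques, the RRF step named in the 「Fusion RAG」 row is simple enough to show in full. The sketch below is generic (the document IDs and the two hit lists are invented for illustration); each retrieval path contributes `1 / (k + rank)` per document, so documents ranked well by several paths float to the top. `k = 60` is the commonly used constant for RRF.

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs via reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Two retrieval paths (e.g. vector search and BM25) disagree on order;
# "B" ranks near the top in both lists, so fusion prefers it.
vector_hits = ["A", "B", "C"]
keyword_hits = ["B", "D", "A"]
print(rrf([vector_hits, keyword_hits]))  # → ['B', 'A', 'D', 'C']
```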
| Technique | Principle | Applicable Scenario |
|---|---|---|
| 「HyDE」 | Let the LLM generate a hypothetical answer first, then retrieve with it | Ambiguously phrased queries |
| 「Query Expansion」 | Expand the user query into multiple variants | Improving recall |
| 「Reranker」 | Re-rank retrieved results with a cross-encoder | Improving precision |
| 「Sentence Window」 | Retrieve small chunks, expand the context window when returning | Needing more context |
| 「Parent Document」 | Retrieve small chunks, return their parent document | Preserving document integrity |
| 「Fusion RAG」 | Fuse multi-path retrieval results (RRF algorithm) | Multi-dimensional recall |
| 「Hybrid Search」 | Combine vector search with BM25 keyword search | Balancing semantics and keywords |

2.3 主流实践方案
2.3 Mainstream Implementation Approaches
开源框架
Open Source Frameworks
- 「LangChain」:最流行的 RAG 框架,生态丰富
- 「LlamaIndex」:专注于数据索引和查询,提供多种索引类型
- 「Haystack」:模块化的 NLP 管道框架
- 「LangChain」: The most popular RAG framework with a rich ecosystem.
- 「LlamaIndex」: Focuses on data indexing and querying, offering various index types.
- 「Haystack」: A modular NLP pipeline framework.
向量数据库
Vector Databases
- 「Milvus」:高性能分布式向量数据库
- 「Chroma」:轻量级嵌入式向量数据库
- 「Pinecone」:全托管云向量数据库
- 「Weaviate」:支持多模态的向量数据库
- 「Qdrant」:Rust 编写的高性能向量数据库
- 「Milvus」: High-performance distributed vector database.
- 「Chroma」: Lightweight embedded vector database.
- 「Pinecone」: Fully-managed cloud vector database.
- 「Weaviate」: Vector database supporting multimodal data.
- 「Qdrant」: High-performance vector database written in Rust.
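All of the databases listed above optimize the same core operation: nearest-neighbor search over stored embeddings. The toy in-memory class below illustrates only that operation; the names (`VectorStore`, `add`, `query`) and the 3-dimensional vectors are invented for illustration and do not match any specific product's API.

```python
import math

class VectorStore:
    """Minimal in-memory stand-in for a vector database's add/query pattern."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, vector: list[float]) -> None:
        self._items.append((doc_id, vector))

    def query(self, vector: list[float], n_results: int = 2) -> list[str]:
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        # Return the IDs of the n_results most similar stored vectors.
        ranked = sorted(self._items, key=lambda item: cosine(vector, item[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:n_results]]

store = VectorStore()
store.add("doc-a", [1.0, 0.0, 0.0])
store.add("doc-b", [0.9, 0.1, 0.0])
store.add("doc-c", [0.0, 1.0, 0.0])
print(store.query([1.0, 0.05, 0.0], n_results=2))  # → ['doc-a', 'doc-b']
```

Real systems replace the linear scan here with approximate nearest-neighbor indexes (e.g. HNSW), which is where the products above differ in performance and deployment model.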
企业级产品
Enterprise Products
- Azure AI Search + OpenAI
- Amazon Bedrock Knowledge Bases
- Google Vertex AI Search
- Azure AI Search + OpenAI
- Amazon Bedrock Knowledge Bases
- Google Vertex AI Search
2.4 代码示例
2.4 Code Examples
基础静态 RAG(LangChain)
Basic Static RAG (LangChain)
```python
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# 1. Load documents
loader = WebBaseLoader("https://example.com/document")
documents = loader.load()

# 2. Chunking
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=50
)
splits = text_splitter.split_documents(documents)

# 3. Create the vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    persist_directory="./chroma_db"
)

# 4. Create the retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4}
)

# 5. Define the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question based on the following context. "
               "If you don't know the answer, say so.\n\nContext: {context}"),
    ("human", "{question}")
])

# 6. Build the chain
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# 7. Query
answer = rag_chain.invoke("Your question")
print(answer)
```

带 Reranker 的静态 RAG
Static RAG with Reranker
```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# Create the reranker
model = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-base")
compressor = CrossEncoderReranker(model=model, top_n=3)

# Wrap the base retriever from the basic example above
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=retriever
)

# Use the reranking retriever; the rest of the chain is assembled
# exactly as in the basic example.
rag_chain = (
    {"context": compression_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```

## 常见问题(FAQ)

### 静态RAG和动态RAG的核心区别是什么?

静态RAG采用「一次检索、一次生成」的线性流程,而动态RAG(如 Self-RAG、CRAG)在生成过程中能动态检索、评估和调整,适应性更强。

Static RAG follows a linear "one retrieval, one generation" flow, while dynamic RAG (such as Self-RAG and CRAG) can retrieve, evaluate, and adjust dynamically during generation, making it more adaptive.

### 如何为我的项目选择静态RAG或动态RAG?

根据文章技术对比部分,若需求简单、文档稳定可选静态RAG;若需处理复杂查询、实时更新或高准确性,动态RAG(如 Self-RAG、CRAG)更合适。

Based on the comparison section, choose static RAG for simple requirements and stable documents; for complex queries, real-time updates, or higher accuracy, dynamic RAG (such as Self-RAG or CRAG) is the better fit.

### RAGFlow平台在RAG实现中有什么优势?

RAGFlow提供集成的 Agent 工作流机制和 SDK,简化了动态RAG的部署与管理,适合需要自动化、可扩展解决方案的企业场景。

RAGFlow provides an integrated agent workflow mechanism and SDK, simplifying the deployment and management of dynamic RAG, which suits enterprise scenarios that need automated, scalable solutions.