UltraRAG 2.0: An Open-Source Framework Built on the MCP Architecture That Simplifies Complex RAG System Development with YAML Configuration

As Retrieval-Augmented Generation (RAG) evolves from a simple "retrieve-then-generate" pipeline into complex systems that combine adaptive knowledge organization, multi-turn reasoning, and dynamic retrieval, researchers often find themselves mired in lengthy engineering work. UltraRAG 2.0, a joint release from Tsinghua University's THUNLP Lab, Northeastern University's NEUIR Lab, OpenBMB, and AI9stars, aims to change that. This open-source framework is built on the Model Context Protocol (MCP) architecture. Developers describe serial execution, loops, and conditional branches in YAML configuration files, and with a few dozen lines can assemble high-performance RAG systems comparable to DeepResearch or Search-o1.

Why is UltraRAG 2.0 Needed?

The soaring complexity of modern RAG systems presents two major pain points for researchers:

Reproducing Existing Methods: Researchers face heterogeneous interfaces and redundant code across different projects.
Validating New Ideas: Approximately 80% of effort is consumed by engineering implementation rather than algorithmic innovation.

The core solution offered by UltraRAG 2.0 is component encapsulation + flexible orchestration:

It encapsulates core functions like retrieval and generation into standardized MCP Servers.
It enables module "hot-swapping" through function-level Tool interfaces.
It uses YAML files for declarative definition of complex reasoning workflows.
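
The encapsulation idea above can be pictured with a small registry. The sketch below is only an illustration under stated assumptions, not UltraRAG's actual API: `ToolRegistry`, `register`, `call`, and both retrieval functions are hypothetical names, and real MCP Servers talk over a protocol rather than through in-process function calls.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical registry mapping 'server.tool' names to plain functions."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, fn: Callable[..., object]) -> None:
        # Re-registering a name replaces the old implementation ("hot-swap").
        self._tools[name] = fn

    def call(self, name: str, **params: object) -> object:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**params)


def keyword_retrieve(query: str, corpus: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by how many query words they contain."""
    words = set(query.lower().split())
    scored = [(len(words & set(doc.lower().split())), doc) for doc in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def prefix_retrieve(query: str, corpus: List[str], top_k: int = 2) -> List[str]:
    """Alternative implementation with the same signature: prefix match only."""
    return [doc for doc in corpus if doc.lower().startswith(query.lower())][:top_k]


registry = ToolRegistry()
corpus = ["RAG combines retrieval and generation", "YAML is a config format"]

registry.register("retriever.retrieve", keyword_retrieve)
first = registry.call("retriever.retrieve", query="retrieval generation", corpus=corpus)

# Swap the implementation without touching the calling code.
registry.register("retriever.retrieve", prefix_retrieve)
second = registry.call("retriever.retrieve", query="yaml", corpus=corpus)
```

Because callers only know the tool name, replacing the function behind `retriever.retrieve` changes behavior without any edit to the code that invokes it.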

Three Core Highlights: Redefining the RAG Development Experience

1. Low-Code Construction of Complex Pipelines

Focus on the logic rather than programming details. By writing a YAML configuration file, you can implement:

Multi-turn iterative retrieval (e.g., Search-o1's chain-of-thought + retrieval mode).
Dynamic workflow switching based on conditional judgments.
Serial/parallel scheduling of complex reasoning chains.

# Example: A simple iterative retrieval-generation pipeline
pipeline:
  - step: retriever.retrieve  # Call the retrieval service
    params:
      query: "{{input.query}}"
      top_k: 5
    output: context
  - loop: 3  # Iterate for 3 rounds
    steps:
      - step: generator.reflect  # Generate reflection
        params:
          context: "{{context}}"
        output: reflection
      - step: retriever.retrieve  # Secondary retrieval based on reflection
        params:
          query: "{{reflection}}"
        output: context  # Overwrite context so the next round reflects on fresh results
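
To make the declarative idea concrete, here is a rough sketch of how an engine could walk a pipeline of this shape. It is a toy under stated assumptions, not UltraRAG's executor: `run`, `resolve`, `TOOLS`, and the stub `retrieve`/`reflect` functions are all hypothetical, the pipeline is written as Python dicts instead of parsed YAML to stay dependency-free, and only `step` and `loop` nodes are handled.

```python
import re
from typing import Any, Callable, Dict, List


# Hypothetical in-process stand-ins for the tools a pipeline would call.
def retrieve(query: str, top_k: int = 5) -> List[str]:
    return [f"doc about {query}"][:top_k]


def reflect(context: List[str]) -> str:
    return f"follow-up on {context[0]}"


TOOLS: Dict[str, Callable[..., Any]] = {
    "retriever.retrieve": retrieve,
    "generator.reflect": reflect,
}

# Matches a string that consists solely of one '{{name}}' placeholder.
PLACEHOLDER = re.compile(r"^\{\{\s*([\w.]+)\s*\}\}$")


def resolve(value: Any, state: Dict[str, Any]) -> Any:
    """Swap a whole-string placeholder for the value stored under that name."""
    if isinstance(value, str):
        match = PLACEHOLDER.match(value)
        if match:
            return state[match.group(1)]
    return value


def run(steps: List[Dict[str, Any]], state: Dict[str, Any]) -> Dict[str, Any]:
    """Execute step nodes in order; a loop node repeats its nested steps."""
    for node in steps:
        if "loop" in node:
            for _ in range(node["loop"]):
                run(node["steps"], state)
        else:
            params = {k: resolve(v, state) for k, v in node.get("params", {}).items()}
            state[node["output"]] = TOOLS[node["step"]](**params)
    return state


pipeline = [
    {"step": "retriever.retrieve",
     "params": {"query": "{{input.query}}", "top_k": 5}, "output": "context"},
    {"loop": 2, "steps": [
        {"step": "generator.reflect",
         "params": {"context": "{{context}}"}, "output": "reflection"},
        # Writes back to `context` so the next round reflects on fresh results.
        {"step": "retriever.retrieve",
         "params": {"query": "{{reflection}}"}, "output": "context"},
    ]},
]

final = run(pipeline, {"input.query": "MCP"})
```

The point of the design is that control flow lives in data: changing the number of rounds or reordering steps means editing the pipeline description, not the interpreter.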

2. Out-of-the-Box Ecosystem Support

The framework comes pre-packaged with support for 17 mainstream research benchmarks, including:

QA Tasks: NaturalQuestions, HotpotQA, etc.
Knowledge-Intensive Tasks: FEVER, TriviaQA, etc.
Long-Text Understanding: Qasper, NarrativeQA, etc.

It also provides large-scale corpus support:

wiki-2018 (21 million documents)
Upcoming wiki-2024 version.

3. Seamless Extension and Deployment

Achieved through modular design:

Supports LLM services deployed via vLLM.
Compatible with vector databases like FAISS and LanceDB.
Custom Servers only need to implement a standard interface.
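
What "implementing a standard interface" can look like is sketched below. The source does not spell out the interface, so everything here is an assumption: `Retriever`, `InMemoryRetriever`, `embed`, and `answer_context` are hypothetical names, and the bag-of-words cosine search is a brute-force stand-in for a FAISS- or LanceDB-backed server.

```python
import math
from typing import Dict, List, Protocol


class Retriever(Protocol):
    """Hypothetical 'standard interface': any class with this method conforms."""

    def retrieve(self, query: str, top_k: int) -> List[str]: ...


def embed(text: str) -> Dict[str, float]:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    vec: Dict[str, float] = {}
    for token in text.lower().split():
        vec[token] = vec.get(token, 0.0) + 1.0
    return vec


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryRetriever:
    """Brute-force backend that satisfies the Retriever protocol."""

    def __init__(self, docs: List[str]) -> None:
        self._docs = docs
        self._vecs = [embed(d) for d in docs]

    def retrieve(self, query: str, top_k: int) -> List[str]:
        q = embed(query)
        order = sorted(
            range(len(self._docs)),
            key=lambda i: cosine(q, self._vecs[i]),
            reverse=True,
        )
        return [self._docs[i] for i in order[:top_k]]


def answer_context(retriever: Retriever, query: str) -> List[str]:
    # Depends only on the interface, so any conforming backend can be passed in.
    return retriever.retrieve(query, top_k=1)


docs = ["vLLM serves large language models", "FAISS searches dense vectors"]
top = answer_context(InMemoryRetriever(docs), "dense vectors search")
```

A different backend would only need its own class with the same `retrieve` method; `answer_context` stays unchanged.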

Deploying an LLM service requires just one command:

# Start vLLM service
bash script/vllm_serve.sh --model lmsys/vicuna-7b-v1.5
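
Once a vLLM server is up, a client can reach it over HTTP. The sketch below assumes the script starts vLLM's OpenAI-compatible server and that it listens on `http://localhost:8000`; neither is confirmed by the source, so check the script for the actual address. `build_completion_request` is a hypothetical helper that only builds the request and does not send it.

```python
import json
import urllib.request
from typing import List


def build_completion_request(
    base_url: str, model: str, question: str, context: List[str]
) -> urllib.request.Request:
    """Build (but do not send) a completions request that includes retrieved context."""
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}\nAnswer:"
    body = {"model": model, "prompt": prompt, "max_tokens": 128, "temperature": 0.0}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_completion_request(
    base_url="http://localhost:8000",  # assumed address, not taken from the script
    model="lmsys/vicuna-7b-v1.5",
    question="What is MCP?",
    context=["MCP stands for Model Context Protocol."],
)

# To send it once the server is running:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```

Keeping request construction separate from sending makes the payload easy to inspect before any server is involved.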

Getting Started in 5 Minutes

1. Environment Setup

# Create a virtual environment
conda create -n ultrarag python=3.11
conda activate ultrarag
# Clone the repository
git clone https://github.com/OpenBMB/UltraRAG.git --depth 1
cd UltraRAG
# Install dependencies
pip install uv
uv pip install -e ".[all]"

2. Verify Installation

# Run the hello world example
ultrarag run examples/sayhello.yaml
# Successful output: Hello, UltraRAG 2.0!

3. Build Your First RAG System

Refer to the official tutorials to implement:

Corpus encoding and index building.
Retrieval service deployment and invocation.
Reasoning pipeline configuration and optimization.

Who is UltraRAG 2.0 For?

Researchers: Quickly reproduce SOTA methods and validate new ideas.
Algorithm Engineers: Build production-level RAG systems at low cost.
Students: Lower the barrier to learning RAG technology and focus on core logic.

The project is open source on GitHub and includes detailed documentation, example code, and video tutorials. The community also publishes daily write-ups of recent RAG papers to help researchers keep up with the field.

UltraRAG 2.0 is reshaping how RAG systems are developed. With engineering complexity significantly reduced, researchers can focus their energy on what matters most: algorithmic innovation and scientific discovery.
