
Graph RAG: How Knowledge Graphs Break Through the Limitations of Large Language Models

2026/1/24
AI Summary (BLUF)

Graph RAG, a knowledge-graph-based form of Retrieval-Augmented Generation (RAG), enhances LLM performance by integrating knowledge graphs into the retrieval step, addressing limitations such as domain-specific knowledge gaps and the inability to access real-time information. It combines entity extraction, subgraph retrieval, and LLM synthesis to provide accurate, context-aware responses.


Introduction

Large Language Models (LLMs) have revolutionized how we interact with information, demonstrating remarkable capabilities in understanding and generating human-like text. However, their inherent design and training methodologies come with significant constraints that can impact their reliability and applicability in specialized or dynamic contexts. Concurrently, advanced techniques like Retrieval-Augmented Generation (RAG) and its evolution, Graph RAG, have emerged to bridge some of these gaps by grounding LLMs in external, structured knowledge sources. This article explores the core limitations of contemporary LLMs and delves into the mechanics and implementation strategies of Graph RAG, providing a technical overview of how knowledge graphs can enhance generative AI systems.


Key Limitations of Large Language Models

While powerful, LLMs are not omniscient. Their performance is bounded by several fundamental architectural and data-related constraints.


1. Insufficient Domain-Specific Information

LLMs are predominantly trained on vast, publicly available datasets. While this provides broad general knowledge, it inherently means they lack access to non-public, domain-specific, or proprietary information. Consequently, when queried about niche topics, internal company data, or recent unpublished research, an LLM may provide incomplete, generic, or inaccurate responses.


2. Potential for Hallucination and Misinformation

LLMs generate responses by predicting the most probable sequence of words based on their training data. When faced with queries outside their training distribution or requiring factual accuracy beyond their knowledge cutoff, they may "hallucinate"—producing plausible-sounding but incorrect or entirely fabricated information. This is not an act of deception but a statistical artifact of their generative nature.


3. Inability to Access Real-Time Information

The training process for state-of-the-art LLMs is computationally intensive and occurs in discrete phases. As a result, their knowledge is static, frozen at the point of their last training update. They cannot access or integrate real-time data, news, or recent events, making them unsuitable for tasks requiring current information.


4. Immutable Pre-Training Data

The foundational knowledge of an LLM is encoded during pre-training on a fixed corpus. If this corpus contains errors, biases, or outdated information, these flaws become embedded in the model's parameters and are extremely difficult to correct or "unlearn" post-deployment. The model will continue to generate responses based on this imperfect foundation.


5. Lack of True Long-Term Memory

LLMs are stateless in terms of conversation or task context across sessions. While they can maintain context within a single interaction window (limited by token count), they possess no persistent memory of past interactions. This makes them ill-equipped for complex, multi-session tasks that require building upon previously established understanding or context.


What is Graph RAG?

Retrieval-Augmented Generation (RAG) is a framework that enhances the generation capabilities of LLMs by dynamically retrieving relevant information from an external knowledge source at inference time. Graph RAG is a specialized variant of RAG that utilizes a Knowledge Graph (KG) as its retrieval backbone.


A Knowledge Graph structures information as a network of entities (nodes) and the relationships (edges) between them. Graph RAG leverages this structured representation to perform more precise and contextually aware retrieval. It treats the knowledge graph as a vast, interconnected vocabulary where entities and relationships are fundamental units.


The core idea is straightforward: for a user query, extract key entities, retrieve a relevant subgraph surrounding those entities, and use this structured context to ground the LLM's generation, leading to more accurate and factually consistent answers.


A Simple Graph RAG Pipeline

The following pseudocode illustrates a fundamental Graph RAG workflow:


def simple_graph_rag(query_str, nebulagraph_store, llm):
    # Step 1: Extract the key entities mentioned in the query
    entities = _get_key_entities(query_str, llm)
    # Step 2: Retrieve a contextual subgraph around those entities
    graph_rag_context = _retrieve_subgraph_context(entities, nebulagraph_store)
    # Step 3: Synthesize the final answer from the structured context
    return _synthesize_answer(query_str, graph_rag_context, llm)
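
Called end to end, the pipeline might be driven like this (the store handle, LLM client, and question are placeholders assumed for illustration):

answer = simple_graph_rag(
    "Who is the author of the book that Company A invested in?",
    nebulagraph_store,  # a connected graph store, e.g. backed by NebulaGraph
    llm,                # any LLM client exposing a .predict() method
)
print(answer)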

Step 1: Entity Extraction
This phase identifies the core entities (people, places, concepts) within the user's query, often using the LLM itself or a dedicated NER model. Synonyms may be expanded to improve retrieval recall.


def _get_key_entities(query_str, llm=None, with_llm=True):
    # ... entity extraction logic ...
    return _expand_synonyms(entities)
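
One possible fleshing-out of the stub above is to prompt the LLM directly for an entity list; the prompt wording here is a hypothetical sketch, and _expand_synonyms is assumed to exist as above:

def _get_key_entities(query_str, llm=None, with_llm=True):
    # Hypothetical sketch: ask the LLM for a comma-separated entity list.
    prompt = (
        "Extract the key entities (people, places, concepts) from the "
        f"question below as a comma-separated list.\nQuestion: {query_str}"
    )
    if with_llm and llm is not None:
        raw = llm.predict(prompt)
        entities = [e.strip() for e in raw.split(",") if e.strip()]
    else:
        entities = []  # fall back to a dedicated NER model here
    return _expand_synonyms(entities)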

Step 2: Subgraph Retrieval
Using the identified entities, the system queries the knowledge graph to fetch a surrounding subgraph, typically exploring relationships up to a specified depth (e.g., 2 hops). This subgraph contains the structured context relevant to the query.


def _retrieve_subgraph_context(entities, nebulagraph_store, depth=2, limit=30):
    # ... subgraph querying logic (an n-hop traversal from each entity) ...
    return nebulagraph_store.get_relations(entities, depth, limit)
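
Concretely, for a NebulaGraph-backed store the traversal could issue one n-hop GET SUBGRAPH statement per entity; the execute() method and result handling below are assumptions for illustration, not a specific library API:

def _retrieve_subgraph_context(entities, nebulagraph_store, depth=2, limit=30):
    # Sketch: expand each entity into its n-hop neighborhood via nGQL.
    relations = []
    for entity_id in entities[:limit]:
        ngql = (
            f'GET SUBGRAPH {depth} STEPS FROM "{entity_id}" '
            'YIELD VERTICES AS nodes, EDGES AS relationships;'
        )
        relations.append(nebulagraph_store.execute(ngql))
    return relations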

Step 3: Answer Synthesis
The retrieved subgraph (often converted into a textual format) is fed into the LLM alongside the original query as augmented context. A carefully designed prompt instructs the LLM to generate an answer based primarily on the provided context.


def _synthesize_answer(query_str, graph_rag_context, llm):
    # Ground the LLM on the retrieved subgraph via a synthesize-and-refine prompt
    return llm.predict(PROMPT_SYNTHESIZE_AND_REFINE, query_str, graph_rag_context)
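
The synthesis prompt itself is not shown in the source; a minimal, hypothetical template might look like the following (the real PROMPT_SYNTHESIZE_AND_REFINE may differ):

# Hypothetical template; placeholders are filled with the query and subgraph text.
PROMPT_SYNTHESIZE_AND_REFINE = """
Answer the question using primarily the knowledge-graph context below.
If the context is insufficient, say so instead of guessing.

Context (entity-relation triplets):
{graph_rag_context}

Question: {query_str}
Answer:
"""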

Bottlenecks of Traditional RAG and the Graph Advantage

Traditional RAG often relies on vector similarity search over unstructured text chunks (e.g., document paragraphs). This approach has two main bottlenecks:


  1. Limited Query Understanding: The retriever may struggle with complex queries that require relational reasoning, since it operates on bag-of-words or semantic embeddings without any explicit relational structure.
  2. Insufficient Textual Comprehension: Retrieving text chunks on semantic similarity alone can miss crucial information when the phrasing differs, and it cannot perform multi-hop reasoning (e.g., "the author of the book that Company A invested in").

Graph RAG addresses these by using a Knowledge Graph. The explicit representation of entities and relationships allows for:

  • Precise Retrieval: Directly locating specific entities and their connected facts.
  • Relational Reasoning: Traversing edges to perform multi-hop queries and discover indirect connections (illustrated by the toy example after this list).
  • Structured Context: Providing the LLM with clean, relational context rather than potentially noisy text passages.

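
To make the multi-hop point concrete, here is a toy, self-contained traversal over an in-memory graph (illustrative only; a real system runs the equivalent traversal against the KG):

# Toy knowledge graph: (subject, predicate) -> object
graph = {
    ("Company A", "invested_in"): "Book X",
    ("Book X", "authored_by"): "Author Y",
}

def hop(entity, predicate):
    """Follow one edge from an entity along the given predicate."""
    return graph.get((entity, predicate))

# Two-hop query: "the author of the book that Company A invested in"
book = hop("Company A", "invested_in")   # hop 1 -> "Book X"
author = hop(book, "authored_by")        # hop 2 -> "Author Y"
print(author)                            # Author Y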

Seven Methods for Graph Exploration with LlamaIndex

The LlamaIndex framework provides versatile tools for implementing Graph RAG. Here we outline seven distinct query/exploration methods.


Method 1: KG Vector-Based Retrieval

query_engine = kg_index.as_query_engine()

This is the default, out-of-the-box query mode for an index built with LlamaIndex, requiring no extra parameters. It finds KG entities via vector similarity, retrieves their connected text chunks, and optionally explores relationships for additional context. Simple, but effective for semantic searches.
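
Methods 1 through 3 assume an existing kg_index. One hedged construction sketch, using the older llama_index import layout (paths differ across LlamaIndex versions; the data directory and question are placeholders):

from llama_index import KnowledgeGraphIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data").load_data()
kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,  # assumed to wrap a graph store, e.g. NebulaGraphStore
    max_triplets_per_chunk=10,
)

response = kg_index.as_query_engine().query("What is entity X known for?")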

Method 2: KG Keyword-Based Retrieval

kg_keyword_query_engine = kg_index.as_query_engine(
    include_text=False,
    retriever_mode="keyword",
    response_mode="tree_summarize",
)

This method uses keyword matching (retriever_mode="keyword") to find the relevant KG entities and their triplets. include_text=False restricts the engine to the raw triplets (subject, predicate, object), excluding the text chunks attached to the matched nodes. response_mode="tree_summarize" returns the response as a summary of a tree built over the retrieved graph structure: the tree is constructed recursively, with the query as the root node and the most relevant answers as the leaf nodes, which makes this mode well suited to summarization tasks.

Method 3: KG Hybrid Retrieval

kg_hybrid_query_engine = kg_index.as_query_engine(
    include_text=True,
    response_mode="tree_summarize",
    embedding_mode="hybrid",
    similarity_top_k=3,
    explore_global_knowledge=True,
)

Setting embedding_mode="hybrid" combines vector-based and keyword-based retrieval over the knowledge graph and deduplicates the merged results: keyword search finds triplets that literally match the query terms, while embedding search adds triplets that are semantically similar, so the hybrid mode draws on the strengths of both approaches to improve the accuracy and relevance of the results. explore_global_knowledge=True tells the query engine not to limit its search to the local context (a node's immediate neighbors) but to consider the broader, global context of the knowledge graph, which is beneficial for complex queries.

Method 4: Native Vector Index Retrieval (Non-Graph Baseline)

vector_index = VectorStoreIndex.from_documents(documents)
vector_query_engine = vector_index.as_query_engine()

This approach bypasses the knowledge graph entirely: it first builds a vector index over the document chunks, then constructs a query engine from that index. It serves as a baseline for comparing pure vector search against the graph-augmented methods.

Method 5: Custom Composite Query Engine (KG + Vector)

This method creates a custom retriever that fetches results from both a KG retriever and a vector retriever, then merges them using union (OR, the default) or intersection (AND) logic, deduplicating along the way. The union combines the relational detail from the knowledge-graph search (KGTableRetriever) with the semantic-similarity hits from the vector search (VectorIndexRetriever); a sketch of such a CustomRetriever follows.
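
A minimal sketch of the combining retriever, following the pattern in the LlamaIndex documentation (import paths and node-id accessors vary by version; shown in the older layout):

from llama_index.retrievers import BaseRetriever

class CustomRetriever(BaseRetriever):
    """Merge KG and vector retrieval results by union (OR) or intersection (AND)."""

    def __init__(self, vector_retriever, kg_retriever, mode="OR"):
        self._vector_retriever = vector_retriever  # e.g. a VectorIndexRetriever
        self._kg_retriever = kg_retriever          # e.g. a KGTableRetriever
        if mode not in ("AND", "OR"):
            raise ValueError("mode must be 'AND' or 'OR'")
        self._mode = mode
        super().__init__()

    def _retrieve(self, query_bundle):
        vector_nodes = self._vector_retriever.retrieve(query_bundle)
        kg_nodes = self._kg_retriever.retrieve(query_bundle)

        vector_ids = {n.node.node_id for n in vector_nodes}
        kg_ids = {n.node.node_id for n in kg_nodes}
        combined = {n.node.node_id: n for n in vector_nodes}
        combined.update({n.node.node_id: n for n in kg_nodes})

        # AND keeps only nodes found by both retrievers; OR deduplicates the union.
        ids = vector_ids & kg_ids if self._mode == "AND" else vector_ids | kg_ids
        return [combined[rid] for rid in ids]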

Method 6: KnowledgeGraphQueryEngine

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    service_context=service_context,
    llm=llm,
    verbose=True,
)

This engine lets you query the knowledge graph in natural language. It uses an LLM to translate the natural-language question into a structured graph query (nGQL for NebulaGraph, or Cypher for Cypher-compatible stores such as Neo4j), which is then executed on the KG. This removes the need to learn a dedicated graph query language.
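
Usage matches any other query engine; with verbose=True the intermediate, LLM-generated graph query is surfaced in the logs before execution (the question below is a placeholder):

response = query_engine.query("Tell me about entity X.")
print(response)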

Method 7: KnowledgeGraphRAGRetriever

graph_rag_retriever = KnowledgeGraphRAGRetriever(
    storage_context=storage_context,
    service_context=service_context,
    llm=llm,
    verbose=True,
)
kg_rag_query_engine = RetrieverQueryEngine.from_args(graph_rag_retriever)

This is a dedicated retriever implementing the core Graph RAG pipeline described earlier, wrapped in a RetrieverQueryEngine. Given a question or task as input, it extracts the relevant entities via keyword or embedding search against the knowledge graph, fetches the surrounding subgraph (with configurable depth), and prepares that context for the downstream LLM task.
