How to Quickly Build a Dynamic Knowledge Graph: Cognee as an Alternative to Traditional RAG

Struggling with data processing and knowledge management in AI application development? Cognee is a tool focused on providing deterministic outputs for AI applications and AI agents, and it helps you build dynamic knowledge graphs as an alternative to traditional RAG systems. A dynamic knowledge graph here means the graph structure that cognee builds automatically by analyzing text data: it contains entity nodes and the relationships between them, and it can be updated and extended over time. This article walks you through the whole process in about 10 minutes, from environment setup to running your first AI application, so you can quickly get a feel for cognee's core functionality.

What is cognee?

Cognee is an open-source project designed to provide deterministic LLM outputs for AI applications and AI agents. It builds scalable, modular ECL (Extract, Cognify, Load) pipelines that give AI systems dynamic memory. These pipelines can interconnect many kinds of documents, such as past conversations, files, images, and audio transcripts, which lets cognee stand in for a traditional RAG system.
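To make the Extract, Cognify, Load idea concrete, here is a toy sketch in plain Python. It is not cognee's implementation: the function names `extract`, `cognify`, and `load`, the regex rule for spotting relationships, and the in-memory dictionary store are all hypothetical stand-ins for work that cognee does with LLMs and real databases.

```python
import re
from collections import defaultdict

# Hypothetical rule: "<source> is a <relation> of <target>"
PATTERN = re.compile(r"^(?P<src>.+?) is an? (?P<rel>.+?) of (?P<dst>.+?)\.?$")


def extract(raw_documents):
    """Extract: split raw documents into trimmed, non-empty sentences."""
    sentences = []
    for doc in raw_documents:
        for part in doc.split("."):
            part = part.strip()
            if part:
                sentences.append(part)
    return sentences


def cognify(sentences):
    """Cognify: turn matching sentences into (source, relationship, target) triples."""
    triples = []
    for sentence in sentences:
        match = PATTERN.match(sentence)
        if match:
            rel = "is_a_" + match["rel"].replace(" ", "_") + "_of"
            triples.append((match["src"], rel, match["dst"]))
    return triples


def load(triples, graph=None):
    """Load: merge triples into an adjacency map; pass an existing graph to extend it."""
    graph = graph if graph is not None else defaultdict(list)
    for src, rel, dst in triples:
        if (rel, dst) not in graph[src]:
            graph[src].append((rel, dst))
    return graph


docs = ["NLP is a subfield of computer science. It is widely used."]
graph = load(cognify(extract(docs)))
print(dict(graph))
```

Passing the existing `graph` back into `load` with triples from new documents extends it in place, which is the "dynamic" part of the idea in miniature.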

Core advantages of cognee

Cognee offers the following advantages:

Reduces developer workload and cost while improving output quality and accuracy
Provides Pythonic data pipelines, meaning data-processing flows that follow Python conventions and integrate easily into the existing Python ecosystem, with ingestion from more than 30 data sources
Highly customizable, with support for custom tasks, custom pipelines, and built-in search endpoints

Environment setup

Installing cognee

Cognee supports Python 3.10 through 3.12. You can install it with any Python package manager, such as pip, poetry, or uv. We recommend uv here because it installs faster.
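Before installing, it can help to confirm that your interpreter falls inside that range. The check below is a small sketch using only the standard library; the helper name `is_supported` is made up for illustration, and the bounds simply mirror the 3.10 to 3.12 range stated above.

```python
import sys

# Supported range as stated in this article: Python 3.10 through 3.12 (inclusive).
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) version lies within the supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_SUPPORTED <= (major, minor) <= MAX_SUPPORTED


current = ".".join(map(str, sys.version_info[:3]))
status = "within" if is_supported() else "outside"
print(f"Python {current} is {status} the 3.10-3.12 range")
```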

First, install uv if it is not already on your system:

pip install uv

Then install cognee with uv:

uv pip install cognee

Detailed installation instructions can be found in the official documentation.

Configuring environment variables

After installation, configure the environment variables. Start by setting the LLM API key, here using OpenAI as an example:

import os
os.environ["LLM_API_KEY"] = "YOUR_OPENAI_API_KEY"

You can also set variables by creating a .env file, using the .env.template shipped with the project as a reference. To use a different LLM provider, consult the official LLM provider documentation.
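If you want to see what reading a .env file involves, the sketch below parses simple KEY=VALUE lines using only the standard library. The helper `load_env_file` is hypothetical and deliberately minimal (no variable expansion, no multi-line values); in practice a dedicated library such as python-dotenv is the more common choice.

```python
import os
from pathlib import Path


def load_env_file(path=".env", override=False):
    """Load KEY=VALUE pairs from a file into os.environ and return what was parsed."""
    parsed = {}
    env_path = Path(path)
    if not env_path.is_file():
        return parsed
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop surrounding quotes, if any
        parsed[key] = value
        # Existing environment variables win unless override is requested.
        if override or key not in os.environ:
            os.environ[key] = value
    return parsed
```

Call `load_env_file()` at the top of your script, then check `os.environ.get("LLM_API_KEY")` before running anything that needs the key.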

Quick start: your first cognee application

The following simple example demonstrates basic cognee usage: adding text data, generating a knowledge graph, and querying it.

Creating the Python script

Create a new Python file, for example simple_example.py, and enter the following code:

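# NOTE: module paths have moved between cognee releases. If an import below
# fails, check where your installed version exposes SearchType and setup_logging.
import cognee
import asyncio
from cognee.modules.search.utils import SearchType
import logging
from cognee.infrastructure.logging import setup_logging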

async def main():
    # Reset cognee data and system state for a clean start
    print("Resetting cognee data...")
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)
    print("Data reset complete.\n")

    # Text to add to cognee, based on which cognee will create a knowledge graph
    text = """
    Natural Language Processing (NLP) is an interdisciplinary subfield of computer science and information retrieval.
    """

    print("Adding text to cognee:")
    print(text.strip())
    # Add text, making it available for cognify processing
    await cognee.add(text)
    print("Text added successfully.\n")

    print("Running cognify to create knowledge graph...\n")
    print("Cognify processing steps:")
    print("1. Document Classification: Determine the type and category of the input text.")
    print("2. Permission Check: Ensure the user has necessary permissions to process the text.")
    print("3. Text Chunk Extraction: Break down text into sentences or phrases for analysis.")
    print("4. Add Data Points: Store extracted chunks for processing.")
    print("5. Generate Knowledge Graph: Extract entities and relationships to form a knowledge graph.")
    print("6. Text Summarization: Create a concise summary of the content for quick insights.\n")

    # Create knowledge graph using LLMs and cognee
    await cognee.cognify()
    print("Cognify processing complete.\n")

    query_text = "Tell me about NLP"
    print(f"Searching cognee for insights using query '{query_text}'")
    # Query cognee for insights about the added text
    search_results = await cognee.search(query_type=SearchType.INSIGHTS, query_text=query_text)

    print("Search results:")
    # Display results
    for result_text in search_results:
        print(result_text)

if __name__ == "__main__":
    logger = setup_logging(log_level=logging.ERROR)
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())

Running the script

After saving the file, run it in the terminal:

python simple_example.py

You should see output similar to the following:

({'id': UUID('bc338a39-64d6-549a-acec-da60846dd90d'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 1, 211808, tzinfo=datetime.timezone.utc), 'name': 'natural language processing', 'description': 'An interdisciplinary subfield of computer science and information retrieval.'}, {'relationship_name': 'is_a_subfield_of', 'source_node_id': UUID('bc338a39-64d6-549a-acec-da60846dd90d'), 'target_node_id': UUID('6218dbab-eb6a-5759-a864-b3419755ffe0'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 15, 473137, tzinfo=datetime.timezone.utc)}, {'id': UUID('6218dbab-eb6a-5759-a864-b3419755ffe0'), 'updated_at': datetime.datetime(2024, 11, 21, 12, 23, 1, 211808, tzinfo=datetime.timezone.utc), 'name': 'computer science', 'description': 'The study of computation and information processing.'})

Each result is a tuple describing one edge of the knowledge graph:

The first element is the source node (e.g., 'natural language processing')
The second element is the relationship between the nodes (e.g., 'is_a_subfield_of')
The third element is the target node (e.g., 'computer science')
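Raw tuples like this are hard to scan, so a small formatter can help. The sketch below assumes each result is a (source node, relationship, target node) tuple of dictionaries with the `name` and `relationship_name` keys seen in the sample output; the helper `describe_triple` and the trimmed sample data are illustrative only.

```python
from uuid import UUID


def describe_triple(triple):
    """Render a (source_node, relationship, target_node) tuple as one readable line."""
    source, relationship, target = triple
    return f"{source['name']} --[{relationship['relationship_name']}]--> {target['name']}"


# Sample shaped like the search output, trimmed to the fields used here.
sample_results = [
    (
        {"id": UUID("bc338a39-64d6-549a-acec-da60846dd90d"), "name": "natural language processing"},
        {"relationship_name": "is_a_subfield_of"},
        {"id": UUID("6218dbab-eb6a-5759-a864-b3419755ffe0"), "name": "computer science"},
    )
]

for triple in sample_results:
    print(describe_triple(triple))
```

In the quick-start script you could call `describe_triple(result)` inside the loop over `search_results`, provided your cognee version returns results in this shape.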

Visualizing the knowledge graph with the UI

Cognee ships with a UI that lets you view and interact with the knowledge graph more intuitively. This section shows how to launch the UI and visualize the knowledge graph we just created.

Creating the UI launch script

Create a file named start_ui_example.py with the following content:

#!/usr/bin/env python3
"""
Demonstrates how to launch the frontend interface using cognee.start_ui().

This showcases the new UI functionality, which works similarly to DuckDB's start_ui().
"""

import asyncio
import cognee
import time

async def main():
    # First, let's add some data to cognee for the UI to display
    print("Adding sample data to cognee...")
    await cognee.add(
        "Natural Language Processing (NLP) is an interdisciplinary subfield of computer science and information retrieval."
    )
    await cognee.add(
        "Machine Learning (ML) is a subset of artificial intelligence focused on algorithms and statistical models."
    )

    # Generate knowledge graph
    print("Generating knowledge graph...")
    await cognee.cognify()

    print("\n" + "=" * 60)
    print("Launching cognee UI...")
    print("=" * 60)

    # Launch UI server
    server = cognee.start_ui(
        host="localhost",
        port=3000,
        open_browser=True,  # This will automatically open your browser
    )

    if server:
        print("UI server started successfully!")
        print("Interface will be available at: http://localhost:3000")
        print("\nPress Ctrl+C to stop the server when done...")

        try:
            # Keep the server running
            while server.poll() is None:  # While the process is still running
                time.sleep(1)
        except KeyboardInterrupt:
            print("\nStopping UI server...")
            server.terminate()
            server.wait()  # Wait for the process to finish
            print("UI server stopped.")
    else:
        print("Failed to start UI server. Check the logs above for details.")

if __name__ == "__main__":
    asyncio.run(main())

Running the UI script

Run it in the terminal:

python start_ui_example.py

The script opens your browser automatically and displays the cognee UI, where you can view and interact with the knowledge graph and explore the relationships between entities.

Getting started faster with the Starter Kit

If you want to get going with cognee even faster, use the cognee-starter-kit. This repository provides structured datasets and pre-built data pipelines that help you complete data ingestion, processing, and visualization in minutes.

Getting the Starter Kit

The Starter Kit lives in the project's cognee-starter-kit directory. You can use the code and data in that directory directly.

Installing dependencies

Change into the cognee-starter-kit directory and run the following command to install the dependencies:

uv sync

Running the example pipelines

The Starter Kit provides several example pipelines. Run whichever fits your needs:

Run the default pipeline
```
python src/pipelines/default.py
```
This script runs the cognify pipeline with default settings. It ingests text data, builds a knowledge graph, and lets you run search queries.
Run the low-level pipeline
```
python src/pipelines/low_level.py
```
This script implements a pipeline with custom ingestion tasks. It processes the supplied JSON data about companies and employees and makes it searchable through the graph.
Run the custom model pipeline
```
python src/pipelines/custom-model.py
```
This pipeline uses custom pydantic models for graph extraction. The script serves as an example of classifying programming languages and visualizing their relationships.

Graph preview

Cognee provides a visualize_graph function that renders the graph for you. Here is an example of its usage:

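import os
import pathlib
# Assumed import path; verify it against your installed cognee version.
from cognee.api.v1.visualize.visualize import visualize_graph

# Run inside an async function, after cognify() has completed.
graph_file_path = str(
    pathlib.Path(
        os.path.join(pathlib.Path(__file__).parent, ".artifacts/graph_visualization.html")
    ).resolve()
)
await visualize_graph(graph_file_path)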

After it runs, you will have an HTML file that you can open in a browser to view the knowledge graph visualization.
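The output path points into an .artifacts folder, and this article does not say whether visualize_graph creates missing folders. As a defensive sketch, the hypothetical helpers below create the folder up front and turn the resulting file path into a file:// URL you can paste into a browser.

```python
import pathlib


def artifact_path(base_dir, filename="graph_visualization.html"):
    """Return an absolute path under <base_dir>/.artifacts, creating the folder if needed."""
    artifacts_dir = pathlib.Path(base_dir).resolve() / ".artifacts"
    artifacts_dir.mkdir(parents=True, exist_ok=True)
    return str(artifacts_dir / filename)


def as_browser_url(path):
    """Convert a local file path into a file:// URL that a browser can open."""
    return pathlib.Path(path).resolve().as_uri()
```

For example, `artifact_path(pathlib.Path(__file__).parent)` yields a path you could pass to visualize_graph, and `as_browser_url(...)` on that path gives the address to open once the file exists.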

Summary and next steps

In this article, you have learned the basic concepts and usage of cognee, and have successfully set up and run your