How can a semantic text retrieval protocol be implemented to enhance the LLM chatbot experience?

Introduction

This is a protocol that lets users offer their writings so that others may search them for semantically similar text. A typical envisioned use case has the server side implement a vector database for text lookup, while the client side either returns the results as-is or feeds them into an LLM prompt for a chatbot experience. Strictly speaking, though, the protocol requires neither a vector database nor an LLM.

Protocol Specification

Request Format

To use the protocol, the client sends the server JSON over HTTP(S) of the form:

{"query": <user query>,
 "max_results": <max number of results returned>}

with additional fields as desired and receives a response of the form:

{"output": <list of outputs>}.

The demos in this repo illustrate using one additional field, max_cutoff.

Also, while the demos use HTTP for simplicity, all of this works over HTTPS.
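
A client for this request/response format can be sketched in a few lines of Python using only the standard library. The sketch assumes the server accepts a POST with a JSON body at whatever URL is passed in; the HTTP method, the URL path, and the function names are assumptions for illustration, not details fixed by the protocol.

```python
import json
import urllib.request


def build_request(query, max_results, **extra_fields):
    """Encode a lookup request as UTF-8 JSON bytes."""
    payload = {"query": query, "max_results": max_results}
    payload.update(extra_fields)  # optional fields such as max_cutoff
    return json.dumps(payload).encode("utf-8")


def parse_response(body):
    """Return the list stored under "output" in a JSON response body."""
    return json.loads(body)["output"]


def lookup(url, query, max_results, **extra_fields):
    """Send one lookup request and return the server's list of outputs.

    Assumption: the server accepts POST with a JSON body at `url`.
    """
    request = urllib.request.Request(
        url,
        data=build_request(query, max_results, **extra_fields),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_response(response.read())
```

With a server running, a call such as `lookup("http://127.0.0.1:5000", "open access", 3, max_cutoff=0.5)` would return the list of outputs; the cutoff value here is purely illustrative.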

Demo Implementations

Full Demo (ChatGPT API Required)

This demo allows both looking up text as-is and talking to ChatGPT using similar text as context. It requires access to the ChatGPT API, and the environment variables OPENAI_ORG_ID and OPENAI_API_KEY must be set according to your OpenAI account.
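
Before starting the full demo, it can help to confirm that both variables are present. The following is a minimal sketch; the helper name `missing_openai_env` is hypothetical and not part of the repo.

```python
import os

REQUIRED_ENV_VARS = ("OPENAI_ORG_ID", "OPENAI_API_KEY")


def missing_openai_env(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Calling `missing_openai_env()` with no arguments inspects the current process environment; an empty list means both variables are set.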

In the data directory, two example documents are provided: Aaron Swartz's Guerilla Open Access Manifesto and the US Declaration of Independence. However, you can replace them with whatever documents you like for semantic lookup.

In config.yaml, there are three parameters that may be tuned as desired (though the default values should be reasonably good for general purpose use):

temp - The temperature used for querying ChatGPT.
max_results - The maximum number of similar paragraphs that may be returned.
max_cutoff - The minimum embedding distance needed for a paragraph to possibly be returned.
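
Once config.yaml has been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`), the three parameters can be collected and sanity-checked in one place. This is a sketch only: the `DemoConfig` class, its validation rules, and the example values are assumptions and do not reflect the repo's actual defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoConfig:
    temp: float        # temperature used when querying ChatGPT
    max_results: int   # most similar paragraphs that may be returned
    max_cutoff: float  # embedding-distance cutoff for returning a paragraph

    @classmethod
    def from_mapping(cls, raw):
        """Build a config from a dict, e.g. the result of parsing config.yaml."""
        config = cls(
            temp=float(raw["temp"]),
            max_results=int(raw["max_results"]),
            max_cutoff=float(raw["max_cutoff"]),
        )
        # Assumed sanity checks; the demos may validate differently or not at all.
        if config.max_results < 1:
            raise ValueError("max_results must be at least 1")
        if config.temp < 0:
            raise ValueError("temp must be non-negative")
        return config


# Placeholder values for illustration only, not the repo's defaults.
example = DemoConfig.from_mapping({"temp": 0.2, "max_results": 3, "max_cutoff": 0.5})
```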

In one terminal, run server.py to start a local web server at 127.0.0.1:5000. When queried via HTTP, it returns relevant paragraphs from the text files in the data directory as JSON.
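
The lookup step behind such a server can be sketched as follows. To stay self-contained, the sketch replaces real embeddings with a toy bag-of-words cosine distance, so it is not how server.py computes similarity. It also assumes that the text files end in .txt, that paragraphs are separated by blank lines, and that max_cutoff is the largest distance a paragraph may have and still be returned; all function names are hypothetical.

```python
import math
import re
from collections import Counter
from pathlib import Path


def load_paragraphs(data_dir):
    """Split every top-level .txt file in data_dir into non-empty paragraphs."""
    paragraphs = []
    for path in sorted(Path(data_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        paragraphs.extend(p.strip() for p in re.split(r"\n\s*\n", text) if p.strip())
    return paragraphs


def bag_of_words(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity of two word-count vectors (1.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    norm = norm_a * norm_b
    return 1.0 if norm == 0 else 1.0 - dot / norm


def find_similar(query, paragraphs, max_results, max_cutoff):
    """Return up to max_results paragraphs within max_cutoff, nearest first."""
    query_vec = bag_of_words(query)
    scored = [(cosine_distance(query_vec, bag_of_words(p)), p) for p in paragraphs]
    kept = sorted((d, p) for d, p in scored if d <= max_cutoff)
    return {"output": [p for _, p in kept[:max_results]]}
```

The returned dictionary has the same shape as the protocol's response, so a web framework handler would only need to serialize it as JSON.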

In another terminal, run full_demo.py to be prompted for instructions.
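
Using similar text as context amounts to placing the returned paragraphs ahead of the user's question in the prompt. The sketch below shows one way to do that; the template wording and the `build_prompt` helper are hypothetical, since how full_demo.py assembles its prompt is not described here.

```python
def build_prompt(question, paragraphs):
    """Combine retrieved paragraphs and a user question into one prompt string."""
    if not paragraphs:
        # Nothing was retrieved, so fall back to the bare question.
        return question
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(paragraphs, start=1))
    return (
        "Use the following excerpts as context when answering.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )
```

The resulting string would then be sent to the chat model as the user message, with temp passed as the sampling temperature.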

Minimal Demo (ChatGPT API Not Required)

This demo allows for looking up similar text.

Usage is similar to that of the full demo. Set parameters as desired in config.yaml (this time temp is unused). Run server.py to set up the server for querying text in the data directory, and run minimal_demo.py to start the demo.

Configuration Parameters Comparison

The behavior of the demos is controlled by parameters in config.yaml. The following table outlines the core parameters, their purpose, and their applicability across the two demo types.

Parameter	Description	Purpose	Full Demo	Minimal Demo
temp	Temperature for LLM generation.	Controls the randomness of ChatGPT's responses.	Required	Unused
max_results	Maximum number of results to return.	Limits the number of similar text snippets retrieved.	Used	Used
max_cutoff	Embedding distance cutoff.	Filters out paragraphs that are not similar enough to the query.	Used	Used

Further Directions

Allowing additional input parameters gives this protocol greater flexibility.

One example not implemented here: instead of querying only the top-level files in data, one could query inside nested subdirectories, with a file path sent in the request to restrict the search to a particular subdirectory.
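
A server supporting this would need to resolve the client-supplied path and make sure it stays inside the data directory. The sketch below shows one way to do so with pathlib; the function names, and the idea of carrying the path in an extra request field, are assumptions, as is the .txt extension.

```python
from pathlib import Path


def resolve_subdirectory(data_root, requested):
    """Resolve a client-supplied relative path inside data_root.

    Raises ValueError if the path escapes data_root or is not a directory.
    """
    root = Path(data_root).resolve()
    target = (root / requested).resolve()
    # Reject anything that resolves outside the data directory, e.g. "../".
    if target != root and root not in target.parents:
        raise ValueError("path is outside the data directory")
    if not target.is_dir():
        raise ValueError("path is not a directory")
    return target


def text_files_under(directory):
    """List .txt files in directory and all nested subdirectories."""
    return sorted(Path(directory).rglob("*.txt"))
```

Resolving the path before comparing it to the root guards against requests such as "../" that would otherwise expose files outside the shared documents.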

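Frequently Asked Questions (FAQ)

What is the main purpose of this semantic text retrieval protocol?

The protocol lets users share their own writing so that others can search it for semantically similar text. In a typical use case, the server implements a vector database for lookup, and the client either returns the results as-is or feeds them into an LLM prompt for a chatbot experience.

How do I query using this protocol?

The client sends the server a JSON request over HTTP(S) containing fields such as "query" (the user query) and "max_results" (the maximum number of results to return). The server returns a JSON response containing an "output" list. The demos also use a "max_cutoff" field to control the quality of the returned results.

What is the difference between the full demo and the minimal demo?

The full demo requires a ChatGPT API key and supports both text lookup and talking to ChatGPT with similar text as context; the minimal demo needs no API key and supports only semantic text lookup. Both are configured through config.yaml, but only the full demo uses the "temp" parameter, which controls ChatGPT's responses.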
