How to Choose an AI System Architecture: From Pure LLMs to Agents, with a Resume-Screening Case Study

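AI agents are a hot topic, but not every AI system needs to be an agent. Agents promise autonomy and decision-making power, yet for many real-world use cases a simpler, more cost-effective solution is the better fit. The key is to choose the right architecture for the problem at hand.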

In this post, we explore recent developments in Large Language Models (LLMs) and discuss key concepts of AI systems.

We have worked with LLMs across projects of varying complexity: from zero-shot prompting to chain-of-thought reasoning (a prompting technique that asks the model to show its reasoning rather than only the final answer), and from RAG-based architectures to sophisticated workflows and autonomous agents.

This is an emerging field with evolving terminology. The boundaries between concepts are still being defined, and classifications remain fluid. As the field matures, new frameworks and practices keep emerging for building more reliable AI systems.

To demonstrate these different systems, we walk through a familiar use case, a resume-screening application, to reveal the unexpected leaps in capability (and complexity) at each level.

Pure LLMs

A pure LLM is essentially a lossy compression of the internet, a snapshot of the knowledge in its training data. It excels at tasks that draw on this stored knowledge: summarizing novels, writing essays about global warming, explaining special relativity to a 5-year-old, or composing haikus.

However, without additional capabilities, an LLM cannot provide real-time information such as the current temperature in NYC. This distinguishes pure LLMs from chat applications like ChatGPT, which augment their core LLM with real-time search and additional tools.

That said, not all enhancements require external context. Several prompting techniques, including in-context learning and few-shot learning, help LLMs tackle specific problems without any context retrieval.

Example:

To check whether a resume is a good fit for a job description, an LLM with one-shot prompting and in-context learning can classify it as Passed or Failed.
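
As a minimal sketch of the one-shot approach described above, the snippet below builds a prompt containing a single worked example and parses the model's reply into a label. The example resume, the prompt wording, and the helper names `build_prompt` and `parse_label` are all illustrative assumptions; the call to an actual LLM is deliberately left out.

```python
# Hypothetical one-shot example; the wording is an assumption for illustration.
ONE_SHOT_EXAMPLE = {
    "job": "Data analyst with SQL and 2+ years of experience",
    "resume": "3 years as a reporting analyst, daily SQL and Excel work",
    "label": "Passed",
}


def build_prompt(job_description: str, resume: str) -> str:
    """Assemble a one-shot classification prompt ending at the label slot."""
    ex = ONE_SHOT_EXAMPLE
    return (
        "Classify the resume as Passed or Failed for the job.\n\n"
        f"Job: {ex['job']}\nResume: {ex['resume']}\nLabel: {ex['label']}\n\n"
        f"Job: {job_description}\nResume: {resume}\nLabel:"
    )


def parse_label(model_output: str) -> str:
    """Map raw model text to 'Passed' or 'Failed'; reject anything else."""
    text = model_output.strip().lower()
    if text.startswith("passed"):
        return "Passed"
    if text.startswith("failed"):
        return "Failed"
    raise ValueError(f"Unexpected model output: {model_output!r}")
```

The prompt would be sent to whichever model the application uses, and `parse_label` keeps free-form replies from leaking into downstream logic.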

RAG (Retrieval-Augmented Generation)

Retrieval methods enhance LLMs by supplying relevant context, making them more current, precise, and practical. You can give an LLM access to internal data to process and work with. This context lets the LLM extract information, create summaries, and generate responses. RAG can also incorporate real-time information by retrieving the latest data.

Example:

The resume-screening application can be improved by retrieving internal company data, such as engineering playbooks, policies, and past resumes, to enrich the context and make better classification decisions.

Retrieval typically relies on tools such as vectorization, vector databases (stores designed for semantic similarity search over high-dimensional embeddings), and semantic search.
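
The retrieval step can be sketched in a few lines. This is a toy illustration, not a production setup: a bag-of-words `Counter` stands in for a real embedding model, an in-memory list stands in for a vector database, and the function names (`embed`, `cosine`, `retrieve`, `build_context_prompt`) are assumptions made for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_context_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Prepend the retrieved documents to the question as context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context above."
```

In a real system the word counts would be replaced by dense embeddings and the sort by an index lookup, but the shape of the flow (embed, rank, prepend to the prompt) stays the same.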

Tool Use and AI Workflows

LLMs can automate business processes by following well-defined paths. They are most effective on consistent, well-structured tasks.

Tool use enables workflow automation. By connecting to APIs, whether for calculators, calendars, email services, or search engines, LLMs can lean on reliable external utilities instead of their own internal, non-deterministic capabilities.

Example:

An AI workflow can connect to the hiring portal to fetch resumes and job descriptions, evaluate qualifications based on experience, education, and skills, and then send the appropriate email response (a rejection or an interview invitation).

For this resume-screening workflow, the LLM needs access to the database, the email API, and the calendar API. It follows predefined steps to automate the whole process programmatically.
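
The fixed fetch, evaluate, and notify sequence above might look like the following sketch. Everything here is a stub under stated assumptions: `fetch_candidates` stands in for the hiring portal, `send_email` appends to a list instead of calling an email API, and a simple rule in `evaluate` stands in for the LLM's judgment. The point is only that the order of steps is hard-coded.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    email: str
    years_experience: int
    skills: set[str]


def fetch_candidates() -> list[Candidate]:
    """Stub for the hiring-portal call; returns fixed sample data."""
    return [
        Candidate("Ada", "ada@example.com", 5, {"python", "sql"}),
        Candidate("Bob", "bob@example.com", 1, {"excel"}),
    ]


def evaluate(c: Candidate, required_skills: set[str], min_years: int) -> bool:
    """Rule-based stand-in for the LLM evaluation step."""
    return c.years_experience >= min_years and required_skills <= c.skills


def send_email(c: Candidate, passed: bool, outbox: list) -> None:
    """Stub for the email API; records what would be sent."""
    subject = "Interview invitation" if passed else "Application update"
    outbox.append((c.email, subject))


def run_workflow(required_skills: set[str], min_years: int) -> list:
    """Run the predefined steps in a fixed order for every candidate."""
    outbox: list = []
    for candidate in fetch_candidates():
        passed = evaluate(candidate, required_skills, min_years)
        send_email(candidate, passed, outbox)
    return outbox
```

Because the path never changes, each step can be tested and monitored on its own, which is a large part of why workflows are easier to make reliable than agents.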

AI Agents

AI agents are systems that reason and make decisions independently. They break tasks into steps, call external tools as needed, evaluate the results, and decide what to do next: store the result, request human input, or proceed to the next step.

This is another layer of abstraction on top of tool use and AI workflows, one that automates both planning and decision-making.

While AI workflows require an explicit user trigger (such as a button click) and follow programmatically defined paths, AI agents can initiate workflows on their own and decide their sequence and combination dynamically.

Example:

An AI agent can manage the entire recruitment process, including parsing CVs, coordinating availability via chat or email, scheduling interviews, and handling schedule changes.

This comprehensive task requires the LLM to access databases, email and calendar APIs, and chat and notification systems.
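
To contrast with a fixed workflow, here is a minimal sketch of an agent loop in which the next step is chosen at run time from the current state. The tools are stubs, and `choose_action` is a hypothetical, hard-coded stand-in for the LLM's planning step; a real agent would ask the model which tool to call next.

```python
from typing import Callable


def parse_resume(state: dict) -> str:
    state["parsed"] = True
    return "resume parsed"


def request_availability(state: dict) -> str:
    state["slot"] = "Tue 10:00"  # stubbed reply from the candidate
    return "candidate offered Tue 10:00"


def schedule_interview(state: dict) -> str:
    state["scheduled"] = state["slot"]
    return f"interview booked for {state['slot']}"


TOOLS: dict[str, Callable[[dict], str]] = {
    "parse_resume": parse_resume,
    "request_availability": request_availability,
    "schedule_interview": schedule_interview,
}


def choose_action(state: dict) -> str:
    """Hypothetical stand-in for the LLM deciding the next step."""
    if not state.get("parsed"):
        return "parse_resume"
    if "slot" not in state:
        return "request_availability"
    if "scheduled" not in state:
        return "schedule_interview"
    return "finish"


def run_agent(max_steps: int = 10) -> list[tuple[str, str]]:
    """Loop: pick an action, run the tool, record the observation."""
    state: dict = {}
    trace = []
    for _ in range(max_steps):  # step cap guards against endless loops
        action = choose_action(state)
        if action == "finish":
            break
        trace.append((action, TOOLS[action](state)))
    return trace
```

The step cap is a small but important design choice: because the sequence is decided dynamically, the loop needs an explicit bound that a fixed workflow gets for free.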

Key Takeaways

1. Not every system needs an AI agent

Start with simple, composable patterns and add complexity only as needed. For some systems, retrieval alone suffices. In our resume-screening example, a straightforward workflow works well when the criteria and actions are clear. Consider an agent approach only when greater autonomy is needed to reduce human intervention.

2. Focus on reliability over capability

The non-deterministic nature of LLMs makes building dependable systems challenging. Creating a proof of concept is quick, but scaling to production often reveals complications. Begin with a sandbox environment, implement consistent testing methods, and establish guardrails for reliability.
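
One simple form a guardrail can take is validating the model's output against an allowed set, retrying a bounded number of times, and falling back to a human when validation keeps failing. The sketch below assumes a `call_model` callable supplied by the application; that name, the retry count, and the fallback label are illustrative choices, not a prescribed design.

```python
from typing import Callable

ALLOWED_LABELS = {"Passed", "Failed"}
FALLBACK = "Needs human review"


def classify_with_guardrail(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> str:
    """Accept only allowed labels; retry, then fall back to a human."""
    for _ in range(max_attempts):
        label = call_model(prompt).strip()
        if label in ALLOWED_LABELS:
            return label
    return FALLBACK
```

Because `call_model` is injected, the same function can be exercised in a sandbox with a fake model that returns scripted replies, which supports the consistent testing the paragraph above recommends.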

3. Architecture comparison and selection guide

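To make the differences between these AI system architectures clearer, the table below summarizes their core capabilities, typical use cases, and main considerations: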


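Architecture	Core capability	Key dependencies/tools	Typical use cases	Main considerations
Pure LLM	Generation and reasoning from training-data knowledge	Prompt engineering (zero-shot/few-shot)	Creative writing, knowledge Q&A, text summarization	Knowledge cutoff; cannot handle real-time or proprietary data
RAG	Knowledge retrieval plus context-augmented generation	Vector database, embedding model, retriever	Q&A over proprietary or up-to-date documents, personalized content generation	Retrieval quality, context-window limits, data-update latency
Tool use and workflows	Calling external APIs/tools along a predefined flow	External APIs, orchestration framework	Automated approval flows, data ETL, content publishing	Determinism and predictability of the flow, error handling
AI agent	Autonomous planning, decision-making, tool calling, and iteration	Planner, memory module, multiple toolsets	Complex project management, dynamic customer support, multi-step research and analysis	High development and debugging complexity, hard-to-guarantee reliability, higher cost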

Frequently Asked Questions (FAQ)

When should you choose RAG over an AI agent?

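Choose RAG when the use case calls for reliability and cost-effectiveness rather than fully autonomous decision-making. For structured tasks such as resume screening, RAG supplies precise information through retrieval and is more practical and stable than a complex agent.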

How do you keep a RAG system reliable?

Focus on the stability of the architecture rather than on maximum capability. Retrieve internal data (such as company handbooks and past resumes) from a vector database to give the LLM accurate context, instead of relying on the model's non-deterministic generation.

What advantages does RAG have over a pure LLM?

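RAG augments the LLM by retrieving real-time or internal data (such as policies and past cases), which makes its answers more current and precise. A pure LLM relies only on a snapshot of its training data and cannot handle tasks that need the latest information.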
