AI代理开发：框架vs纯代码？2026年深度对比分析

Introduction

The landscape of AI agent development is rapidly evolving, with frameworks like LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. and LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。 gaining significant attention. This article presents a practical, hands-on comparison between building agents using these popular frameworks versus implementing a custom solution from scratch. Based on a real-world experiment, we aim to cut through the hype and provide developers with actionable insights into the trade-offs involved in each approach.

AI 智能体开发领域正在快速发展，像 LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. 和 LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。这样的框架获得了大量关注。本文基于一个真实的实验，对使用这些流行框架构建智能体与从头开始实现自定义解决方案进行了实践性的比较。我们的目标是穿透营销宣传，为开发者提供关于每种方法所涉及权衡的可操作见解。

Methodology and Context

To conduct this comparison, we took a straightforward, production-tested agent architecture—a single-tier LLM router—and implemented it in three ways: using pure Python code, LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models., and LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。. Our team has been running a Co-pilot agent in production for approximately eight months and has assisted numerous clients with their agent deployments, giving us a broad perspective on real-world challenges.

为了进行这项比较，我们采用了一个经过生产环境测试的、简单的智能体架构——单层 LLM 路由器——并用三种方式实现了它：使用纯 Python 代码、LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. 和 LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。。我们的团队已经在生产环境中运行一个 Co-pilot 智能体大约八个月，并协助了许多客户进行智能体部署，这使我们能够广泛地了解现实世界中的挑战。

The tested architecture involves a single LLM that uses function calling to route tasks or "skills." These skills may themselves involve additional LLM calls before returning control to the main router. This pattern is simple yet versatile, commonly seen in various client implementations.

被测试的架构涉及一个使用函数调用来路由任务或“技能”的单一 LLM。这些技能本身在将控制权返回给主路由器之前，可能涉及额外的 LLM 调用。这种模式简单而通用，在各种客户实现中都很常见。

Key Findings and "Hot Takes"

Our experiment yielded several critical observations, which we present as our core "hot takes":

我们的实验得出了几个关键观察结果，我们将其作为核心的“直言不讳的观点”呈现：

Hot Take #1: Framework Abstractions Can Add Complexity for Experienced Developers

For developers with substantial experience in building agentic systems, the abstractions introduced by frameworks like LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. and LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。 can sometimes feel like unnecessary overhead. The learning curve associated with understanding a framework's specific paradigms and APIs may outweigh the benefits for a well-understood, simple architecture. Writing custom code provides maximum transparency and control, which is invaluable during debugging and optimization.

对于在构建智能体系统方面拥有丰富经验的开发者来说，像 LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. 和 LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。这样的框架引入的抽象有时会感觉像是不必要的开销。对于已经理解透彻的简单架构，学习理解框架特定范式和 API 的曲线可能超过其带来的好处。编写自定义代码提供了最大的透明度和控制力，这在调试和优化过程中是无价的。

Hot Take #2: Built-in Parallelism Complicates Debugging

While frameworks often promote features like built-in parallel execution as a major advantage, our experience suggests this can significantly complicate the debugging process. Tracing the flow of execution and understanding state changes becomes more challenging when operations are concurrent. In a custom-coded solution, developers have fine-grained control over concurrency, allowing them to introduce parallelism deliberately and debug it incrementally.

虽然框架经常将内置并行执行等功能作为主要优势进行宣传，但我们的经验表明，这可能会显著增加调试过程的复杂性。当操作并发执行时，跟踪执行流程和理解状态变化变得更加困难。在自定义编码的解决方案中，开发者对并发性拥有细粒度的控制，允许他们有意识地引入并行性并逐步进行调试。

Hot Take #3: Frameworks Offer Valuable Structure for Less Experienced Teams

Conversely, in environments with less experienced development teams or where there is no existing scaffolding for agent development, these frameworks can provide crucial structure and best practices. They offer a faster path to a Proof of Concept (POC) by handling common boilerplate code, state management, and providing a conceptual model for structuring agent logic. The key is to match the tool to the team's expertise and the project's phase.

相反，在开发团队经验不足或没有现有智能体开发脚手架的环境中，这些框架可以提供至关重要的结构和最佳实践。它们通过处理常见的样板代码、状态管理以及提供构建智能体逻辑的概念模型，为概念验证提供了一条更快的路径。关键在于使工具与团队的专业知识和项目阶段相匹配。

Framework-Specific Observations

LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. Implementation Notes

During our LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. implementation, we encountered specific nuances. A community member pointed out that LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models.'s nodes are essentially functions, allowing for custom tool-handling logic beyond the provided ToolNode abstraction. However, to integrate tools with a model using bind_tools, functions typically require the @tool decorator, which introduced challenges with parameters like self that aren't supplied by the LLM but are required by Pydantic validation.

在我们的 LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. 实现过程中，我们遇到了一些具体的细微差别。一位社区成员指出，LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. 的节点本质上是函数，允许超越提供的 ToolNode 抽象的自定义工具处理逻辑。然而，为了使用 bind_tools 将工具与模型集成，函数通常需要 @tool 装饰器，这引入了诸如 self 之类的参数挑战，这些参数不是由 LLM 提供的，但却是 Pydantic 验证所必需的。

The core value of LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. lies in its scheduling, checkpointing, and state management capabilities. It's important to distinguish between LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. (the graph execution engine) and LangChain (the broader ecosystem of abstractions), as they can be used with varying degrees of coupling.

LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models. 的核心价值在于其调度、检查点和状态管理能力。区分 LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models.（图执行引擎）和 LangChain（更广泛的抽象生态系统）非常重要，因为它们可以以不同程度的耦合来使用。

LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。

Our evaluation of LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。 highlighted its approach to structuring agent logic. The findings from this implementation contributed to our overall assessment of framework trade-offs regarding abstraction and control.

我们对 LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。的评估突出了其构建智能体逻辑的方法。此实现的发现有助于我们对框架在抽象和控制方面的权衡进行整体评估。

Community Perspectives and Future Work

The Hacker News discussion revealed additional insights and tools. One developer introduced magentic, a Python library aiming to be a middle ground by handling retries, logging, tracing, and asyncio concurrency without enforcing specific prompts or patterns. This highlights the ongoing innovation in the space to find the "right" level of abstraction.

Hacker News 上的讨论揭示了更多的见解和工具。一位开发者介绍了 magentic，这是一个旨在成为中间地带的 Python 库，它处理重试、日志记录、跟踪和 asyncio 并发，而不强制执行特定的提示或模式。这凸显了在该领域为寻找“正确”的抽象级别而进行的持续创新。

There is also significant community interest in extending this comparison to other frameworks like CrewAI一个多智能体协作框架，允许创建和管理多个AI智能体协同完成任务。 and AutoGen微软开发的AI代理框架，支持多代理对话和协作，文章提到正在对其进行评估。. Future analyses of these tools will provide a more comprehensive view of the ecosystem.

社区也有浓厚的兴趣将这种比较扩展到其他框架，如 CrewAI一个多智能体协作框架，允许创建和管理多个AI智能体协同完成任务。 和 AutoGen微软开发的AI代理框架，支持多代理对话和协作，文章提到正在对其进行评估。。未来对这些工具的分析将为生态系统提供一个更全面的视角。

Conclusion and Resources

Choosing between a custom-coded agent and a framework-driven approach is not a one-size-fits-all decision. It requires careful consideration of:

Team Expertise: The development team's familiarity with agent concepts and the framework itself.
Project Complexity: The simplicity or complexity of the required agent logic.
Development Stage: The need for rapid prototyping versus long-term maintainability and control.
Debugging Needs: The importance of transparent, straightforward debugging processes.

在自定义编码的智能体和框架驱动的方法之间做出选择并非一刀切的决定。它需要仔细考虑：

团队专业知识： 开发团队对智能体概念和框架本身的熟悉程度。

项目复杂性： 所需智能体逻辑的简单性或复杂性。

开发阶段： 快速原型制作与长期可维护性和控制之间的需求。

调试需求： 透明、直接调试过程的重要性。

For those interested in delving deeper, we have made all resources publicly available:

Full Article on Towards Data Science: Choosing Between LLM Agent Frameworks
Complete Code Repository: GitHub - Arize-ai/phoenix
Interactive Traces in Arize Phoenix一个用于监控和调试AI系统的平台，在文章中用于捕获和分析不同框架实现的代理跟踪日志。:
- Pure Code: Phoenix Demo - Pure Code
- LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models.: Phoenix Demo - LangGraph
- LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。: Phoenix Demo - Workflows

The ideal path forward likely involves a nuanced approach, potentially leveraging lighter-weight libraries for boilerplate reduction while maintaining core control over agent logic. As the ecosystem matures, we anticipate the emergence of tools that better balance power, simplicity, and transparency.

对于有兴趣深入探索的人，我们已公开所有资源：

Towards Data Science 上的完整文章： Choosing Between LLM Agent Frameworks

完整代码仓库： GitHub - Arize-ai/phoenix

Arize Phoenix一个用于监控和调试AI系统的平台，在文章中用于捕获和分析不同框架实现的代理跟踪日志。中的交互式追踪：

纯代码：Phoenix Demo - Pure Code

LangGraphA framework within the LangChain ecosystem for building stateful, multi-actor applications with language models.：Phoenix Demo - LangGraph

LlamaIndex WorkflowsLlamaIndex提供的工作流框架，用于构建和管理AI代理的复杂流程，提供高级抽象来简化代理开发。：Phoenix Demo - Workflows

理想的前进道路可能需要一种细致入微的方法，可能利用更轻量级的库来减少样板代码，同时保持对智能体逻辑的核心控制。随着生态系统的成熟，我们期待出现能够更好地平衡功能、简单性和透明度的工具。

常见问题（FAQ）

使用框架构建AI代理能够使用大型语言模型和工具自主执行任务、推理目标并做出决策的智能系统。会增加开发复杂性吗？

对于经验丰富的开发者，框架的抽象层可能增加不必要的复杂性，学习曲线可能超过简单架构的收益。

框架的并行功能会影响调试吗？

是的，内置并行执行会使跟踪执行流程和状态变化更困难，而自定义代码允许更精细的并发控制和逐步调试。

新手团队应该选择框架还是纯代码开发？

经验不足的团队更适合使用框架，它们提供结构、最佳实践和样板代码，能更快实现概念验证。