Forge Reasoning API vs. Nous Chat: Which Is Easier to Use? A Hands-On Comparison of the Latest AI Reasoning Platforms of 2026
Nous Research launches the Forge Reasoning API Beta and the Nous Chat platform. Forge enhances the Hermes 70B model with Monte Carlo Tree Search, Chain of Code, and Mixture of Agents techniques so that it can compete with larger models on reasoning benchmarks.
Introduction
At Nous Research, we are excited to announce the launch of two new projects: the Forge Reasoning API Beta and Nous Chat, a streamlined chat platform powered by the Hermes language model. The Forge Reasoning API embodies our latest advancements in inference-time AI research, building upon the foundational work of the original Hermes model.
Nous Chat: A Dedicated Interface for Hermes 3
Nous Chat is our dedicated platform for interacting with the powerful Hermes 3 70B language model. We've designed a sleek interface that puts sophisticated AI capabilities at your fingertips while maintaining ease of use.
Hermes 3 is an open-source language model built for higher expression, long-form thinking, and individual alignment. At hermes.nousresearch.com, our threaded conversation system helps you organize your thoughts and projects, while the system prompt and configuration options give you full control over your AI interactions. Whether you're conducting analysis, exploring future scenarios, or seeking practical advice, Nous Chat provides the focused environment you need to make the most of our beloved open-source model.
Nous Chat is available at hermes.nousresearch.com and is currently free!
How Does Forge Affect the LLM Ecosystem?
The Forge Reasoning API allows you to take any popular model and supercharge it with a code interpreter and advanced reasoning capabilities. Our evaluations demonstrate that Forge augments the Hermes 70B model to be competitive with much larger models from Google, OpenAI, and Anthropic in reasoning benchmarks.
Hermes 70B x Forge outperforms larger models in the AIME evaluation specifically. This metric focuses on competition-grade math questions—the AIME competition is one of the two tests used to determine eligibility for the US Math Olympiad and has been used as a standard for similar reasoning systems in the past.
Benchmarks offer only one view of any LLM technology; we are most interested in real-world use cases and the vaunted "vibe checks" that come from field testing of inference. We are currently exploring how the Forge Reasoning system affects frontier closed models.
The Forge Reasoning API (Beta)
The Forge Reasoning API will be available in Beta for a select group of users starting this week. "Forge" integrates multiple research breakthroughs, including our Hermes model family, Mixture of Agents, Chain of Code, and Monte Carlo Tree Search, to create a comprehensive system for enhanced reasoning capabilities.
The Beta phase will focus on testing the architecture of our reasoning system. Power users of Hermes are included in the initial beta testing group for Forge because we know they are capable of unlocking and battle-testing the primitives in the API. Our compute partner for the Beta is Lambda.
The Forge Reasoning API represents an advancement and innovation in LLM inference, designed to elevate LLMs to new heights of reasoning.
The Model Layer: Freedom of Choice
Understanding the importance of flexibility, we've designed Forge to support multiple models, including:
- Hermes 3
- Claude Sonnet 3.5
- Gemini
- GPT 4
Users can either utilize a single model to drive their Monte Carlo Tree Search implementation or combine multiple models to enhance output diversity.
The Reasoning Layer: Three Approaches to Reasoning
The Forge Reasoning API is built upon three architectures developed through our research:
1. MCTS (Monte Carlo Tree Search)
Monte Carlo Tree Search (MCTS) is a research area we have focused on for the past year, and it is particularly useful in planning problems. The architecture iteratively builds a decision tree: it selects promising nodes based on an exploration-exploitation balance, simulates random actions until a terminal state is reached, and then backpropagates the results to update node values.
The architecture operates through four key phases:
- Selection: Identifying promising nodes for exploration
- Expansion: Adding new decision nodes
- Simulation: Testing random action sequences
- Backpropagation: Updating node statistics based on simulation results
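The four phases can be sketched in a few lines of Python. This is a minimal illustration and not Forge's implementation: the toy problem (choose three bits, reward is the fraction of ones), the `Node` class, the `mcts` function, and the UCB1 exploration constant of 1.4 are all assumptions made for this example.

```python
import math
import random

DEPTH = 3          # toy problem: pick 3 bits in sequence (assumption)
ACTIONS = (0, 1)

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state = state      # tuple of bits chosen so far
        self.parent = parent
        self.action = action    # action that led from parent to this node
        self.children = []
        self.visits = 0
        self.value = 0.0        # sum of rewards backed up through this node

def is_terminal(state):
    return len(state) == DEPTH

def reward(state):
    return sum(state) / DEPTH   # fraction of ones, in [0, 1]

def untried_actions(node):
    tried = {child.action for child in node.children}
    return [a for a in ACTIONS if a not in tried]

def ucb1(child, parent_visits, c=1.4):
    # Mean reward (exploitation) plus a bonus for rarely visited nodes (exploration).
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def mcts(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node(state=())
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and not terminal.
        while not is_terminal(node.state) and not untried_actions(node):
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one new child for an untried action.
        if not is_terminal(node.state):
            action = rng.choice(untried_actions(node))
            child = Node(node.state + (action,), parent=node, action=action)
            node.children.append(child)
            node = child
        # 3. Simulation: play random actions until a terminal state.
        state = node.state
        while not is_terminal(state):
            state = state + (rng.choice(ACTIONS),)
        r = reward(state)
        # 4. Backpropagation: update statistics on the path back to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    best = max(root.children, key=lambda ch: ch.visits)
    return best.action, root

best_action, root = mcts()
```

Returning the most-visited child rather than the child with the highest mean reward is a common choice, because visit counts are less sensitive to a few lucky rollouts. In an LLM setting the random rollout would presumably be replaced by model-generated continuations, but the source does not describe that detail.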
2. CoC (Chain of Code)
CoC (or Chain of Code) is a series of reasoning steps, known as a Chain of Thought, connected to a code interpreter. CoC allows for vast improvements in code and math capabilities when using the API. Most "real-life" math and code-based problems are deeply woven into semantic structure (e.g., "After the new policy that was passed in January 2025, how much tax will I pay on a can of Pepsi in New York?"), and CoC is built particularly to tackle issues of this nature.
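The idea of mixing semantic steps with executed code can be sketched as follows. This is an illustrative toy, not the Forge API: `stub_model` is a hypothetical stand-in for a language model call, the step format is invented for this example, and the price and tax rate are made-up numbers rather than real New York figures.

```python
from typing import Callable, Dict, List, Tuple

# Each step is (variable name, kind, payload).
# "ask" steps need world knowledge and go to the model;
# "code" steps are arithmetic and go to the interpreter.
Step = Tuple[str, str, str]

def stub_model(question: str) -> float:
    """Hypothetical stand-in for an LLM call. The numbers are invented."""
    canned = {
        "shelf price of one can, in dollars": 1.25,
        "sales tax rate that applies to the can": 0.08,
    }
    return canned[question]

def run_chain(steps: List[Step], model: Callable[[str], float]) -> Dict[str, float]:
    env: Dict[str, float] = {}
    for name, kind, payload in steps:
        if kind == "ask":
            env[name] = model(payload)
        elif kind == "code":
            # eval is acceptable here only because the steps are hard-coded;
            # real model-written code would need a proper sandbox.
            env[name] = eval(payload, {"__builtins__": {}, "round": round}, env)
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return env

steps = [
    ("price", "ask", "shelf price of one can, in dollars"),
    ("rate", "ask", "sales tax rate that applies to the can"),
    ("tax", "code", "round(price * rate, 2)"),
    ("total", "code", "price + tax"),
]
result = run_chain(steps, stub_model)
```

The split matters because the model is good at resolving the semantic parts (which policy applies, which rate), while the interpreter is reliable at the arithmetic that follows.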
3. MoA (Mixture of Agents)
Perhaps each model you use to solve a problem is only seeing part of the picture—enter Mixture of Agents (MoA). We can allow many models to respond to a query, confer with one another, and synthesize new answers. The consensus of LLMs judges the best answer, resulting in a more complete and diverse output than one model can provide alone. MoA can be used alongside the other techniques by simply selecting more than one model in the API.
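The respond, confer, and judge-by-consensus flow can be sketched with stub models. Everything here is an assumption for illustration: `make_model` builds fake models that return fixed guesses, "conferring" is reduced to adopting an answer that a strict majority of peers share, and the final judgment is a simple majority vote. The source does not specify how Forge implements any of these steps.

```python
from collections import Counter
from typing import Callable, List, Tuple

Model = Callable[[str, List[str]], str]

def make_model(first_guess: str) -> Model:
    """Build a hypothetical stub model with a fixed initial answer."""
    def model(query: str, peer_answers: List[str]) -> str:
        if not peer_answers:
            return first_guess
        # Confer: switch only if a strict majority of peers agree on one answer.
        top, count = Counter(peer_answers).most_common(1)[0]
        return top if count > len(peer_answers) / 2 else first_guess
    return model

def mixture_of_agents(query: str, models: List[Model], rounds: int = 1) -> Tuple[str, List[str]]:
    # Round 0: every model answers independently.
    answers = [m(query, []) for m in models]
    # Later rounds: each model sees the other models' answers and may revise.
    for _ in range(rounds):
        answers = [
            m(query, answers[:i] + answers[i + 1:])
            for i, m in enumerate(models)
        ]
    # Consensus: the most common final answer wins.
    winner, _votes = Counter(answers).most_common(1)[0]
    return winner, answers

models = [make_model("42"), make_model("42"), make_model("41")]
winner, final_answers = mixture_of_agents("What is 6 * 7?", models)
```

With real models, the revision step would pass the peers' full responses back into each prompt, and the judge could itself be a model rather than a vote count.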
In a side-by-side comparison of single-turn output from Forge and o1 preview, Forge's output shows nuance and elasticity, offering several options so that the user can ultimately decide.
Core Technology Comparison
The following table provides a concise comparison of the core reasoning techniques integrated into the Forge API, highlighting their primary functions and key characteristics.
| Technique | Primary Function | Key Characteristic |
|---|---|---|
| MCTS (Monte Carlo Tree Search) | Planning and decision-making in complex state spaces | Exploration-exploitation balance; iterative tree building |
| CoC (Chain of Code) | Enhancing code execution and semantic math reasoning | Connects reasoning chains to a code interpreter |
| MoA (Mixture of Agents) | Aggregating and refining outputs from multiple LLMs | Consensus-based answer synthesis; increases diversity |
Conclusion
The Forge Reasoning API represents our vision for the future of LLM technology: offering unprecedented capabilities in reasoning and autonomous operation. The API Beta is opening today; while there are some limitations (single-turn capabilities only), we plan to expand the capabilities of the engine rapidly with feedback from a small group of users. We look forward to working with our community.
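Frequently Asked Questions (FAQ)
How does Nous Research's Forge Reasoning API improve the performance of the Hermes 70B model?
By integrating Monte Carlo Tree Search, Chain of Code, and Mixture of Agents, the Forge Reasoning API adds a code interpreter and advanced reasoning capabilities to the Hermes 70B model, making it competitive with larger models on reasoning benchmarks such as AIME.
What are the main features of the Nous Chat platform?
Nous Chat is a chat platform built for the Hermes 3 70B model. It offers a clean interface, a threaded conversation system, and system prompt and configuration options. It is currently free to use at hermes.nousresearch.com and supports tasks such as analysis and scenario exploration.
What stage is the Forge Reasoning API at, and how do I get access?
The Forge Reasoning API is in Beta and opens to a select group of users starting this week. It integrates several research breakthroughs, including the Hermes model family, Mixture of Agents, Chain of Code, and Monte Carlo Tree Search.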