How Do Meta's HyperAgents Enable AI Self-Evolution? A 2026 Analysis of the Metacognitive Framework
Meta's HyperAgents framework introduces the first AI system capable of 'metacognitive self-modification'—not only improving task performance but also optimizing how it improves itself, enabling cross-domain self-accelerating evolution.
On March 23rd, Meta FAIR and Meta Superintelligence Labs jointly released a groundbreaking research paper—HyperAgents. This represents the industry's first AI system framework to achieve "metacognitive self-modification": it can not only autonomously improve task performance but also refine "the way it improves itself," thereby enabling self-accelerating evolution across domains. The paper was co-authored by researchers from institutions including UBC, Vector Institute, University of Edinburgh, and NYU, and the code has been open-sourced on GitHub.

From Darwin Gödel Machine to HyperAgents: Breaking the Ceiling in Programming
To understand the significance of HyperAgents, one must first understand its predecessor—the Darwin Gödel Machine (DGM). Proposed by the same team in 2025, DGM is a self-improving system capable of autonomously modifying its own code. Drawing inspiration from Darwinian evolution, it maintains a continuously growing "archive" of AI agents, achieving open-ended evolution through generating variants, evaluating performance, and retaining the best individuals. In programming benchmarks, DGM significantly improved performance on SWE-bench from 20.0% to 50.0% and on the Polyglot benchmark from 14.2% to 30.7%.
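The archive-based evolutionary loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the `evaluate` and `mutate` callables are hypothetical stand-ins for benchmark scoring and LLM-driven self-modification.

```python
import random

def evolve_archive(initial_agent, evaluate, mutate, iterations=100):
    """Toy sketch of a DGM-style open-ended evolutionary loop."""
    # The archive grows monotonically: no agent is ever discarded, which
    # keeps low-scoring "stepping stones" available for later branches.
    archive = [(initial_agent, evaluate(initial_agent))]
    for _ in range(iterations):
        # Sample a small tournament and pick the fittest as the parent.
        candidates = random.sample(archive, k=min(3, len(archive)))
        parent, _ = max(candidates, key=lambda pair: pair[1])
        child = mutate(parent)          # self-modification step
        score = evaluate(child)         # benchmark the new variant
        archive.append((child, score))  # retain it regardless of score
    return max(archive, key=lambda pair: pair[1])
```

Keeping every variant, rather than only the current best, is what makes the search open-ended: a modification that hurts performance now may still be the ancestor of a later breakthrough.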
However, DGM has a fundamental limitation: its self-improvement mechanism is fixed and human-designed, incapable of being modified by the system itself. More critically, DGM's ability to become "stronger through modification" in the programming domain relies on an implicit assumption—that improvements in programming ability naturally lead to improvements in self-modification ability, as both are essentially programming tasks. However, this "alignment relationship" breaks down when applying DGM to non-programming domains such as paper reviewing, robot reward function design, or mathematical grading. Becoming better at writing poetry does not necessarily mean becoming better at modifying one's own code.
HyperAgents was born to solve this very problem.
Core Architecture: Task Agent + Meta Agent = Hyperagent
The core innovation of HyperAgents lies in introducing the concept of a "hyperagent." A hyperagent consists of two components: a "task agent" responsible for solving specific tasks, and a "meta agent" responsible for modifying the entire system (including modifying itself). The key is that both components are integrated into the same editable program.
This means the hyperagent can improve not only "how to solve problems" but also "how to improve itself." The research team terms this capability "metacognitive self-modification"—the system can autonomously optimize its own improvement process without relying on any fixed, human-designed meta-level mechanism.
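The two-component structure can be illustrated with a toy sketch. Everything here is an assumption for clarity (the class and method names are not from the released code); the essential point it shows is that the meta agent receives the entire program source, including its own, as editable input.

```python
from dataclasses import dataclass

@dataclass
class Hyperagent:
    """Toy sketch: task-solving and self-modification in one editable program."""
    task_agent_src: str  # code that solves domain tasks
    meta_agent_src: str  # code that rewrites the whole program, itself included

    def solve(self, task):
        # The task agent handles the domain problem (reviewing, grading, ...).
        namespace = {}
        exec(self.task_agent_src, namespace)
        return namespace["solve"](task)

    def self_modify(self):
        # The meta agent sees *both* sources, so it can edit the task agent,
        # the meta agent, or both -- which is what makes the improvement
        # process itself improvable.
        namespace = {}
        exec(self.meta_agent_src, namespace)
        new_task_src, new_meta_src = namespace["improve"](
            self.task_agent_src, self.meta_agent_src)
        return Hyperagent(new_task_src, new_meta_src)
```

Because `self_modify` returns a new hyperagent rather than mutating in place, successive generations can coexist in an archive and be compared against one another.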
Technically, the research team combined HyperAgents with DGM's open-ended evolutionary framework to create DGM-Hyperagents (DGM-H). DGM-H retains DGM's population-based evolutionary search structure (maintaining an expanding archive of hyperagents) while making the entire meta-level modification mechanism editable. Since it uses the Turing-complete Python language, hyperagents can theoretically construct any computable machine.
Experimental Results: Cross-Domain Self-Evolution and Transferable Meta-Abilities
The research team conducted experimental validation across four vastly different domains.
- Programming (Polyglot Benchmark): DGM-H achieved improvements comparable to the original DGM without specific optimization for programming—training set performance increased from 0.140 to 0.340, and full test set performance from 0.084 to 0.267.
- Paper Reviewing: DGM-H improved an initial agent's score of 0.0 (complete failure due to output format mismatch) to 0.710 on the test set, with several top agents surpassing the open-source AI-Scientist-v2 reviewing baseline (0.630). The system autonomously evolved from simple "role-playing prompts" to a structured multi-stage evaluation pipeline, including a list of paper weaknesses, explicit accept/reject criteria, and decision rules.
- Robot Reward Function Design: DGM-H improved from 0.060 to 0.372, surpassing the default reward function (0.348) that directly optimizes the evaluation metric. Notably, in a test task (maximizing robot torso height), DGM-H almost always generated reward functions that induced jumping behavior, avoiding the local optimum of "standing still."
- IMO-Level Mathematical Grading: The team reported an even more striking finding: the meta-level improvement abilities of hyperagents evolved on paper reviewing and robot tasks transferred directly to the entirely new domain of mathematical grading. The transferred hyperagent improved test-set performance from 0.0 to 0.630 within 50 iterations, while a non-transferred initial agent made almost no progress under the same conditions. This indicates that DGM-H learns not domain-specific tricks but a general meta-ability of "how to improve."
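The robot-reward result above can be illustrated with a toy reward function. The paper does not publish the exact function DGM-H evolved; this sketch only shows the shaping idea it describes: reward height above the standing pose plus upward velocity, so that standing still earns nothing and jumping is the only way to keep scoring.

```python
def torso_height_reward(torso_z, torso_vz, standing_height=1.0):
    """Illustrative reward shaping (not the paper's exact function).

    torso_z:  current torso height in meters
    torso_vz: vertical torso velocity in m/s
    """
    # Zero while merely standing: only height *above* the rest pose counts,
    # which removes the "stand still" local optimum.
    height_bonus = max(0.0, torso_z - standing_height)
    # A small bonus for upward motion encourages push-off into a jump.
    velocity_bonus = max(0.0, torso_vz)
    return height_bonus + 0.1 * velocity_bonus
```

The coefficients here are arbitrary; the structural point is that the evolved reward scores the behavior that maximizes the evaluation metric (peak torso height) rather than the static pose that merely satisfies it.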
Furthermore, the research shows that these meta-level improvements accumulate across runs. When the transferred hyperagent was combined with the strongest existing baseline, ProofAutoGrader, and run for another 200 iterations, final accuracy on the full IMO-GradingBench rose from 0.561 to 0.601.
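The multi-stage reviewing pipeline described in the paper-reviewing result above (weaknesses list, explicit criteria, decision rule) might look roughly like this sketch. The stage prompts and the `llm` callable are illustrative assumptions, not the evolved agent's actual code.

```python
def review_paper(paper_text, llm):
    """Toy sketch of a structured multi-stage review pipeline."""
    # Stage 1: enumerate concrete weaknesses instead of free-form opinion.
    weaknesses = llm(
        "List the main weaknesses of this paper:\n" + paper_text)
    # Stage 2: check the paper against explicit accept/reject criteria.
    criteria = llm(
        "For each criterion (novelty, soundness, clarity), answer yes or no:\n"
        + paper_text)
    # Stage 3: a deterministic decision rule maps the checks to a verdict.
    # A fixed output format matters here: the initial agent scored 0.0
    # purely because its free-form output did not match the benchmark.
    accept = criteria.lower().count("yes") >= 2
    return {"weaknesses": weaknesses,
            "decision": "accept" if accept else "reject"}
```

Replacing a single role-play prompt with staged prompts plus a hard-coded decision rule is the kind of structural change DGM-H discovered on its own.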
Autonomously Emergent "Metacognitive" Capabilities
One of the most fascinating parts of the paper is the qualitative analysis of the metacognitive capabilities that emerged autonomously within DGM-H. These capabilities were not explicitly requested or programmed by anyone; they developed spontaneously as the system pursued better task performance.
- Performance Tracking System: The hyperagent autonomously developed a PerformanceTracker class capable of recording, aggregating, and comparing performance metrics across iterations, identifying which modifications led to sustained improvement and which caused regression, thereby supporting data-driven decisions.
- Persistent Memory Mechanism: Also invented autonomously by the system. The hyperagent created a MemoryTool class that stores not only raw data but also synthesized insights, causal hypotheses, and forward-looking plans. For example, the system would record strategic analyses like, "Gen55 has the highest accuracy but is too strict; Gen64 has better balance but slightly lower accuracy; we need to combine the strengths of both."
- Automatic Bias Detection and Correction: Another spontaneously emergent capability. In the paper reviewing task, the hyperagent detected its own pathological behavior of a 99% acceptance rate (with only 49% accuracy) and subsequently automatically introduced label distribution tracking and threshold analysis mechanisms to correct it, restoring accuracy to 63%.
- Computational Resource-Aware Strategy Planning: Another prominent capability. The hyperagent learned to adjust its improvement strategy based on the number of remaining iterations, making more radical architectural changes early on and focusing on conservative, high-confidence fine-tuning toward the end.
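A toy sketch of how these four capabilities might fit together. Every class, function, and threshold here is reconstructed from the paper's prose descriptions for illustration, not taken from the released code.

```python
import json

class PerformanceTracker:
    """Capability 1: record per-iteration scores and flag regressions."""
    def __init__(self):
        self.history = []  # list of (iteration, score) pairs

    def record(self, iteration, score):
        self.history.append((iteration, score))

    def regressions(self):
        # Modifications whose score dropped versus the previous iteration.
        return [(it, s) for (_, prev), (it, s)
                in zip(self.history, self.history[1:]) if s < prev]

class MemoryTool:
    """Capability 2: persist insights, hypotheses, and plans across runs."""
    def __init__(self, path="memory.json"):
        self.path, self.entries = path, []

    def note(self, kind, text):
        # `kind` distinguishes raw data from insights, hypotheses, and plans.
        self.entries.append({"kind": kind, "text": text})

    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.entries, f)

def acceptance_bias(decisions, max_rate=0.9):
    """Capability 3: label-distribution check that flags a pathological
    skew (e.g. accepting 99% of papers) so thresholds can be revised."""
    rate = sum(d == "accept" for d in decisions) / len(decisions)
    return rate > max_rate or rate < 1 - max_rate

def pick_strategy(iteration, total, explore_fraction=0.5):
    """Capability 4: budget-aware planning -- radical rewrites while ample
    iterations remain, conservative fine-tuning near the end."""
    if total - iteration > total * explore_fraction:
        return "radical_rewrite"
    return "conservative_finetune"
```

The notable point is not any individual piece, each of which is mundane, but that the system assembled this kind of self-monitoring machinery without being asked to.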
Safety Considerations: How to Keep Self-Evolving AI Under Control
The research team placed significant emphasis on safety issues. All experiments were conducted under strict safety constraints: agent-generated code was executed in a sandbox environment with enforced resource limits (timeouts, restricted network access), evaluation used predefined tasks and metrics, and human supervision was maintained throughout.
The paper candidly discusses the core risks: as self-modification becomes increasingly open-ended, AI systems may evolve at speeds far exceeding human auditing capacity. The team emphasizes that safety should not be framed merely as a matter of absolute guarantees or full interpretability; the core challenge lies in balancing AI's potential as a catalyst for human progress against the level of trust humans are willing to grant these systems. Evaluation gaming (a manifestation of Goodhart's law) is another serious hazard: when a system optimizes for a metric, it may discover strategies that exploit loopholes in the evaluation, scoring better on the metric while deviating from the true objective.
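The timeout constraint mentioned above can be sketched with a subprocess-based runner. This is a minimal illustration of the idea, not the paper's sandbox: a real setup would also restrict network access and filesystem permissions, which this sketch does not.

```python
import subprocess
import sys

def run_sandboxed(code, timeout_s=10):
    """Run generated code in a separate process with a hard timeout.

    Returns (returncode, stdout); returncode -1 means the run was killed.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout_s)
        return result.returncode, result.stdout
    except subprocess.TimeoutExpired:
        # Runaway self-generated code is cut off rather than trusted.
        return -1, ""
```

Isolating each variant in its own process also means a crashing or malformed agent cannot corrupt the evolutionary loop that spawned it.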
Industry Significance: A New Paradigm for Self-Evolving AI
The release of HyperAgents comes at a time when research into AI self-improvement is accelerating. Several related works have emerged previously: the DGM-derived Huxley-Gödel Machine achieved near-human-level performance in programming; Group-Evolving Agents explored population evolution through experience sharing; Live-SWE-agent studied real-time self-evolution of software engineering agents. Meta itself is advancing related directions in engineering practice, such as the recently released Ranking Engineer Agent (REA), which has achieved a 5x increase in engineering output for ad ranking model optimization.
However, the fundamental difference between HyperAgents and these works is that it, for the first time, makes "improving the improvement process itself" an automatable, cross-domain general capability, rather than being confined to specific tasks or reliant on manually crafted meta-mechanisms. This marks a paradigm shift from "AI-assisted search for better solutions" to "AI autonomously improving the way it searches for better solutions."
Considering Meta's overall strategy in superintelligence—over $70 billion invested in Superintelligence Labs, the ongoing construction of the Prometheus and Hyperion superclusters, and Zuckerberg's public statement that he has "seen early signs of models improving themselves"—HyperAgents can be viewed as a key technical milestone on Meta's roadmap towards superintelligence.
FAQ
What exactly does HyperAgents' "metacognitive self-modification" mean?
It means the AI system can not only autonomously improve task performance but also optimize its own improvement process, achieving cross-domain, self-accelerating evolution by "improving how it improves itself."
What are the key breakthroughs of HyperAgents over its predecessor, DGM?
DGM's self-improvement mechanism is fixed and human-designed. HyperAgents' "task agent + meta agent" architecture lets the system autonomously modify the entire improvement process, including itself.
In which domains has HyperAgents demonstrated practical results?
It significantly improved performance on programming benchmarks, and it resolved DGM's misalignment between task ability and self-improvement ability in non-programming domains such as paper reviewing and robot reward design.