
How do you deploy Hermes Agent on GCE and integrate Gemini 2.5 Flash? A complete 2026 deployment tutorial

2026/5/3

AI Summary (BLUF)

Step-by-step guide to deploy Hermes Agent on GCE with Gemini 2.5 Flash, including configuration, common pitfalls (e.g., 404 errors from incorrect model identifiers), and Systemd-based background persi

Background

After solving the LINE Bot's Vertex AI migration problem, I started wondering whether there could be an AI assistant that is "more proactive" and "has long-term memory". That led me to NousResearch's open-source project, Hermes Agent.

Unlike a typical chatbot, Hermes is designed as an "operating system that breathes". It can execute shell commands, write Python scripts, manage long-term memory, and stay in touch with you at any time through gateways such as Telegram and Discord.

To make it available 24/7, I chose to deploy it on Google Compute Engine (GCE). This article documents the deployment from scratch, along with the pitfalls I ran into when configuring the latest Gemini 2.5 Flash model.


Environment Parameter Preparation

Before you start, make sure you have the following parameters ready:

  • PROJECT_ID: YOUR_PROJECT_ID
  • LOCATION: global
  • GOOGLE_API_KEY: YOUR_GOOGLE_API_KEY (Obtained from Google AI Studio)

Step 1: Create a GCE Instance

Hermes Agent needs some computing power to handle tool calls, so the e2-medium machine type is recommended.

gcloud compute instances create hermes-agent-vm \
    --project=YOUR_PROJECT_ID \
    --zone=us-central1-a \
    --machine-type=e2-medium \
    --image-family=ubuntu-2204-lts \
    --image-project=ubuntu-os-cloud \
    --boot-disk-size=30GB \
    --metadata=startup-script='#!/bin/bash
        apt-get update
        apt-get install -y git curl python3-pip python3-venv nodejs npm
    '
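
The startup script above runs in the background at boot, so the packages may not be installed yet the first time you SSH in. A minimal sketch of a readiness check, assuming bash on the VM; missing_tools is a hypothetical helper name, and the command names are the ones these Ubuntu packages normally provide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command name that is not yet on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (run on the VM):
#   missing_tools git curl python3 pip3 node npm
# No output means everything the startup script installs is in place.
```

If the check prints any names, wait a minute and run it again before moving on to the installation step.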

Step 2: Install Hermes Agent

After SSHing into the VM, run the official one-click installation script.

1. Enter the VM

gcloud compute ssh hermes-agent-vm --zone=us-central1-a

2. Execute the installation

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc

Step 3: Configure Gemini 2.5 Flash (SOP Practice)

This is the step where you are most likely to hit a pitfall. Hermes may default to model identifiers that do not exist or are outdated.

  1. Create a configuration file: in ~/.hermes/config.yaml, specify Gemini 2.5 Flash exactly (gemini-2.5-flash), without the google/ prefix.

  2. Set the API key: write the key and permission settings to the ~/.hermes/.env file.
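
The prefix rule in step 1 is easy to verify mechanically. A minimal sketch, assuming bash; model_ids_ok is a hypothetical helper name, and the check deliberately makes no assumption about the layout of config.yaml: it only searches the file for a google/-prefixed Gemini name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file is readable and contains
# no google/-prefixed Gemini model name. Offending lines go to stderr.
model_ids_ok() {
    local config_file="$1"
    if [ ! -r "$config_file" ]; then
        echo "cannot read $config_file" >&2
        return 2
    fi
    if grep -n 'google/gemini' "$config_file" >&2; then
        echo "remove the google/ prefix in $config_file" >&2
        return 1
    fi
}

# Example (run on the VM):
#   model_ids_ok ~/.hermes/config.yaml && echo "model identifiers look fine"
```

Since ~/.hermes/.env holds a secret, it is also reasonable to restrict it to its owner with chmod 600 ~/.hermes/.env; this is a general precaution, not something the Hermes setup is known to require.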


Step 4: Connect to Telegram and Background Persistence

To keep the Agent from disappearing when the SSH connection drops, we manage it with Systemd.

1. Create a Systemd service (/etc/systemd/system/hermes.service):

[Unit]
Description=Hermes Agent Gateway
After=network.target

[Service]
Type=simple
User=root
Environment=HOME=/root
Environment=PYTHONUNBUFFERED=1
ExecStart=/usr/local/lib/hermes-agent/venv/bin/hermes gateway run
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

2. Start the service

sudo systemctl daemon-reload
sudo systemctl enable hermes
sudo systemctl restart hermes

Blood and Tears in the Migration Process: Troubleshooting Agent Unresponsiveness

Even with the correct configuration, I still ran into a situation where the Agent reads messages but does not reply. After checking the logs (journalctl -u hermes), I found several deeper pitfalls:
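
The log check mentioned above can be narrowed to the lines that matter. A minimal sketch, assuming bash; filter_gateway_errors is a hypothetical helper name, and the pattern (404 or error, case-insensitive) is only a starting point for what to look for.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only log lines that mention a 404 or an error.
# Reads log text on stdin so it works with journalctl output or a saved file.
filter_gateway_errors() {
    # "|| true" keeps the exit status at 0 when no line matches.
    grep -iE '404|error' || true
}

# Example (run on the VM):
#   journalctl -u hermes --no-pager -n 200 | filter_gateway_errors
```

To watch the gateway live while sending a test message from Telegram, journalctl -u hermes -f streams new log lines as they arrive.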

Pitfall 1: The 404 Ghost of Gemini 3.0

Chasing the newest version, I configured gemini-3-flash-preview. As a result, the logs filled up with 404 Model Not Found errors.

Reason: Hermes' internal auxiliary_client.py hardcodes gemini-3-flash-preview as the default value in many places. When these auxiliary functions (such as title generation) fail, they affect the reply logic of the entire Gateway.

Solution: manually define all auxiliary models as gemini-2.5-flash in config.yaml, or patch the source code directly with sed.
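
The sed route can be sketched as follows, assuming bash; pin_aux_model is a hypothetical helper name, and the path to auxiliary_client.py is left as a parameter because it depends on how the installer laid out the files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace every gemini-3-flash-preview with gemini-2.5-flash
# in one file, keeping a .bak copy of the original next to it.
pin_aux_model() {
    local target_file="$1"
    sed -i.bak 's/gemini-3-flash-preview/gemini-2.5-flash/g' "$target_file"
}

# Example: pass the path of auxiliary_client.py inside your Hermes install:
#   pin_aux_model /path/to/auxiliary_client.py
```

A source patch like this is likely to be overwritten the next time Hermes is updated, so setting the auxiliary models in config.yaml is probably the more durable of the two options.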

Pitfall 2: Prefix Confusion of Model Identifiers

Different SDKs are inconsistent here: some use google/gemini-2.5-flash, others use gemini-2.5-flash.

Lesson learned: in Hermes' Gemini Provider, the short name gemini-2.5-flash is the safest choice. Adding google/ causes API routing errors.

Pitfall 3: Conflict between Systemd and Already Running Processes

If you run hermes gateway manually and then start the service, the system reports Gateway already running (PID xxxx).

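Solution: in the Systemd unit, add ExecStartPre=-/usr/bin/pkill -9 -f hermes before ExecStart so that every start begins from a clean state. Note that the shell idiom /usr/bin/pkill -9 -f hermes || true does not work here: Systemd does not run the line through a shell, so || true is passed to pkill as extra arguments. The leading - is Systemd's own way of ignoring a non-zero exit status, which pkill returns when no process matched.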
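
One way to add the pre-start step without editing the unit file itself is a Systemd drop-in. A minimal sketch, assuming bash; write_prestart_dropin and the file name 10-prestart.conf are hypothetical. The pattern is narrowed to "hermes gateway" on the assumption that the gateway's command line contains that text, because a bare hermes pattern also matches commands such as systemctl restart hermes and journalctl -u hermes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a drop-in that clears stray gateway processes
# before each start. The leading "-" tells systemd to ignore a non-zero exit
# (pkill returns 1 when nothing matched).
write_prestart_dropin() {
    local dropin_dir="$1"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/10-prestart.conf" <<'EOF'
[Service]
ExecStartPre=-/usr/bin/pkill -9 -f "hermes gateway"
EOF
}

# Example (run as root on the VM), then reload and restart:
#   write_prestart_dropin /etc/systemd/system/hermes.service.d
#   systemctl daemon-reload && systemctl restart hermes
```

Keeping the change in a drop-in means the main unit file stays exactly as written in Step 4, and removing the directory undoes it.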

Pitfall Comparison Table

| Pitfall | Key cause | Solution |
|---|---|---|
| The 404 Ghost of Gemini 3.0 | The internal auxiliary_client.py hardcodes gemini-3-flash-preview as the default value | Manually define all auxiliary models as gemini-2.5-flash in config.yaml, or patch the source code with sed |
| Prefix Confusion of Model Identifiers | google/gemini-2.5-flash and gemini-2.5-flash are both in use across different SDKs | In Hermes' Gemini Provider, use the short name gemini-2.5-flash and do not add the google/ prefix |
| Conflict between Systemd and Already Running Processes | After hermes gateway has been run manually, the Systemd service detects a duplicate process on start | Add ExecStartPre=-/usr/bin/pkill -9 -f hermes before ExecStart in the Systemd unit so every start begins clean (the leading - makes Systemd ignore a non-zero exit) |

Summary

My dedicated Hermes Agent is now running stably on GCE and is reachable via Telegram at any time. It not only helps me look up information, but can also run simple computing scripts for me directly on the cloud VM.

This deployment taught me that, with models changing this quickly, the official documentation (or an MCP tool query) is the only source of truth. Don't blindly chase the latest version number; making sure the identifier matches what the current API actually serves is the key to stable operation.

If you also want a 24/7 AI digital double, set up a dedicated machine by following this SOP!

FAQ

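How do I avoid 404 errors with Gemini 2.5 Flash when deploying Hermes Agent?

Do not add the google/ prefix to the model identifier. Use the exact short name gemini-2.5-flash, as in Step 3, set the model in config.yaml, and put the API key in ~/.hermes/.env.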

How do I keep Hermes Agent running persistently in the background on GCE?

Use Systemd: create /etc/systemd/system/hermes.service, set ExecStart to run hermes gateway run, and turn on automatic restarts with Restart=always.

Which Google Cloud parameters do I need before deploying Hermes Agent?

You need PROJECT_ID (the project ID), LOCATION (usually set to global), and GOOGLE_API_KEY (obtained from Google AI Studio).
