
How to Remotely Control AI Coding Agents from Your Phone Using AI Agents: Hands-On Experience from 2026

2026/5/1

AI Summary (BLUF)

This article describes how the author forked the open-source Paseo app and used AI coding agents to enhance its support for mobile remote control of coding agents, focusing on session handoff, default

Introduction

TLDR - I wanted to upgrade my coding-from-my-phone workflow. I found Paseo, forked it, and used AI agents to make the parts I cared about work better for me.


I'm spending a few weeks thinking about the process of coding with AI agents. Not just "which model is best" or "which harness performs best," but my actual mechanics. How I use these tools day-to-day. Where sessions live. How I know what an agent is doing when I am not staring at the same terminal.


On my laptop and desktop, I am slowly getting that into shape. The missing piece is my phone. I am admittedly obsessed with vibe coding, and lugging my laptop around everywhere is a bit of a social faux pas. Or at least I feel like it is.


I want to be able to check on agents, answer prompts, nudge a task forward, and occasionally kick off work when I do not have a computer in front of me.


The Problem: Finding a Mobile Coding Workflow

The tool I wanted needed to match how I actually work:


  • Linux support
  • Android support
  • Local coding, not a cloud service or SaaS
  • Support for the agents I already use, especially OpenCode
  • A way to continue or inspect sessions from my real dev environment

SSH on a phone technically exists, but that is not the workflow I wanted. I wanted a control plane, not a tiny, inconvenient terminal.


Paseo: The Foundation That Mostly Worked

Paseo was the first thing I found that felt shaped like the problem I actually had.


There are other apps in this space. I also looked at Emdash, SoloTerm, and a handful of other projects. I suspect that Google, Anthropic, OpenAI, and other providers will keep developing their own handoff-to-phone workflows. But I wanted one consistent layer across agents.


Tool Core Value Proposition Open Source Linux Support Android Support Local Coding Handoff from Terminal Manual Tuning Needed
Paseo Local-first, remote control for coding agents Minimal
Emdash Cloud-based coding workspace Limited Moderate
SoloTerm SSH-based terminal on mobile High

Paseo is a local-first app for monitoring and controlling coding agents remotely. A daemon runs on your machine. The mobile app talks to that. Your code does not need to move into a cloud workspace just so you can poke an agent from the couch. This was important. I did not want a separate mobile coding environment. I wanted a remote control for the environment I already have.

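To make the daemon-plus-phone shape concrete, here is a minimal sketch of a local daemon's request handler. It is not Paseo's actual protocol: the message fields (`action`, `agent_id`, `text`), the `AGENTS` registry, and `handle_message` are all hypothetical names chosen for illustration. The point it shows is that only small control messages cross the wire, while the code and the agent stay on the machine.

```python
import json

# Hypothetical in-memory view of the agents this daemon supervises.
AGENTS = {
    "a1": {"name": "opencode", "status": "running", "pending_prompt": None},
}


def handle_message(raw: str) -> str:
    """Handle one JSON request from the phone and return a JSON reply."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return json.dumps({"ok": False, "error": "invalid json"})
    if not isinstance(msg, dict):
        return json.dumps({"ok": False, "error": "expected an object"})

    action = msg.get("action")
    if action == "list":
        # Status only: no source files are sent to the phone.
        return json.dumps({"ok": True, "agents": AGENTS})

    agent = AGENTS.get(msg.get("agent_id"))
    if agent is None:
        return json.dumps({"ok": False, "error": "unknown agent"})

    if action == "send":
        # Queue the text for the local agent; the work happens on this machine.
        agent["pending_prompt"] = msg.get("text", "")
        return json.dumps({"ok": True})

    return json.dumps({"ok": False, "error": "unknown action"})
```

A real daemon would also need authentication and a transport (for example a WebSocket); both are left out here to keep the sketch small.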

Paseo was also open source, which turned out to be the difference between "this is close" and "I can probably make this work."


Gaps in My Workflow: Where Paseo Fell Short

Paseo was already thoughtfully designed and looked great on mobile. It supported the core agents I cared about, but my OpenCode workflow exposed some rough edges.


The biggest one was session handoff. I would start a session in terminal OpenCode, then later want to pick it up in Paseo. Not a new agent. Not a fresh chat. The same session, with the same context. The details matter here. Who owns the session right now? What happens if the terminal and Paseo both try to drive it?

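One way to think about the "who owns the session" question is a single-owner lease with an explicit takeover. The sketch below is an assumption about how such a rule could look, not a description of how Paseo or OpenCode handle it; `SessionLease` and the client names `"terminal"` and `"phone"` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionLease:
    """Tracks which client is allowed to drive a session at a given moment."""

    session_id: str
    owner: Optional[str] = None  # None means nobody has claimed the session yet

    def acquire(self, client: str) -> bool:
        """Claim the session only if it is free or already owned by this client."""
        if self.owner in (None, client):
            self.owner = client
            return True
        return False

    def take_over(self, client: str) -> Optional[str]:
        """Explicit handoff: return the previous owner so it can be made read-only."""
        previous = self.owner if self.owner != client else None
        self.owner = client
        return previous

    def can_send(self, client: str) -> bool:
        """Only the current owner may send prompts into the session."""
        return self.owner == client
```

With this rule, a phone that merely opens a session cannot silently compete with the terminal. It has to call `take_over`, which gives the app a natural place to tell the other side that it has lost control.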

There were smaller papercuts too. Slash command autocomplete did not quite feel like OpenCode. Typing /q should obviously mean quit, but the app's ranking and visual ordering could make that feel wrong. Workspace defaults mattered too. If OpenCode already knows the right model or mode for a project, Paseo should not casually override that with its own preferences.

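The /q complaint comes down to ranking: a plain prefix match returns commands in list order, so the intended command can land below a less common one. A minimal sketch of alias-aware ranking follows. The command list, the `/queue` entry, and the alias table are made up for illustration and are not taken from OpenCode or Paseo.

```python
# Hypothetical command list; "/queue" is listed first on purpose.
COMMANDS = ["/queue", "/quit", "/exit", "/help"]
# Hypothetical alias table mapping a short form to its full command.
ALIASES = {"/q": "/quit"}


def rank_commands(typed, commands=COMMANDS, aliases=ALIASES):
    """Order matches as: exact match, alias target, then other prefix matches."""
    typed = typed.lower()
    alias_target = aliases.get(typed)
    ranked = []
    for position, command in enumerate(commands):
        if command == typed:
            tier = 0
        elif command == alias_target:
            tier = 1
        elif command.startswith(typed):
            tier = 2
        else:
            continue  # not a match, so it is not offered at all
        # Original list position breaks ties so the order stays stable.
        ranked.append((tier, position, command))
    return [command for _, _, command in sorted(ranked)]
```

Here `rank_commands("/q")` returns `["/quit", "/queue"]` even though `/queue` comes first in the list, which is the behavior a thumb on a phone keyboard wants.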

And then there were subagents. OpenCode can spawn subagents, but in Paseo those could look like generic long-running tool calls. From the phone, that read as "is this thing stuck?" even when the agent is doing exactly what it should be doing.

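The difference between "stuck" and "busy" is mostly a labeling problem. The sketch below assumes a timeline event is a dictionary with hypothetical fields (`kind`, `tool`, `status`, and an optional `subagent` entry with `name` and `task`). The real event shapes in OpenCode and Paseo may look quite different.

```python
def describe_event(event: dict) -> str:
    """Turn a timeline event into the one-line label shown on the phone."""
    status = event.get("status", "running")
    subagent = event.get("subagent")

    if event.get("kind") == "tool_call" and subagent:
        # Name the subagent and its task instead of showing a bare tool call.
        name = subagent.get("name", "unnamed")
        task = subagent.get("task") or "working"
        return f"Subagent {name}: {task} ({status})"

    if event.get("kind") == "tool_call":
        return f"Tool {event.get('tool', 'unknown')} ({status})"

    # Anything else is treated as plain message text.
    return event.get("text", "")
```

A long-running event then reads as "Subagent explore: scan tests (running)" rather than an anonymous spinner.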

Problem | Specific Description | Why It Matters
Session Handoff | Cannot resume an existing OpenCode session started outside Paseo | Breaks flow between laptop and phone
Slash Command UX | Autocomplete ranking doesn't match OpenCode's behavior (e.g., /q for quit) | Creates friction in mobile typing
Workspace Defaults | Paseo overrides OpenCode's model/mode settings | Disrupts project-specific configuration
Subagent Opacity | Subagents appear as generic tool calls, not identifiable tasks | Causes confusion about whether agent is stuck

Gemini CLI support is also on my list, but that is probably a future rabbit hole.


Fixing the Gaps with AI Agents

Turns out, if I can articulate a problem, agents can solve it. So I forked Paseo and used AI coding agents to improve Paseo's support for my needs.


Most of the work was not me sitting down with a perfect design doc and typing code by hand. It was me steering with human language:


  • "I want to resume an OpenCode session that started outside Paseo."
  • "This resumed session looks empty, but I know there is history."
  • "Make /q behave more like terminal OpenCode."
  • "Subagents look like mysterious hangs. What's going on?"

The agents investigated the codebase, made changes, wrote tests, and occasionally wandered into a wall like agents do. I still had to make decisions and review the results. I was less "implementing a feature" and more describing the workflow I wanted until the software caught up.


I fear professional developers will cringe at parts of my approach. But I also think I am part of an interesting archetype that software people should pay attention to: technical enough to prompt well, inspect results, and push through rough edges, but not necessarily approaching every problem like a traditional dev.


That is not a replacement for engineering discipline. It is a new on-ramp to shaping software.


Key Improvements: What I Changed

I think my fork now has a much better OpenCode handoff story. But I really only focused on three areas where the "mobile gap" was most painful to me:


Problem | Specific Issue | My Improvement
Session Handoff | Cannot resume existing OpenCode sessions | Discover active sessions; show preview of recent history for seamless laptop-to-phone transition
Slash Command UX | Autocomplete ranking felt wrong | Ranked commands so common intents like /q or /exit are always at top; respect workspace settings
Subagent Opacity | Subagents looked like generic tool calls | Subagents now report identity and current task status in the timeline; makes "hang" visible

  • Zero-Friction Handoffs: The app now discovers active OpenCode sessions in any workspace. I can resume a session and immediately see a preview of recent history, making the jump from laptop to phone feel seamless rather than like a context-switch.
  • Muscular Defaults: We tuned the UX to respect OpenCode's workspace settings and ranked slash commands so that common intents like /q or /exit are always at the top. The mobile keyboard is enough of a hurdle; the software shouldn't add to it.
  • Subagent Transparency: Instead of opaque "tool calls," subagents now report their identity and current task status directly in the timeline. If I can't see what a subagent is doing, I'm inclined to assume it's stuck. Surfacing that intent turns a mysterious "hang" into a visible, productive task.
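"Respect OpenCode's workspace settings" is a precedence rule, and it can be sketched as a layered merge. This is an illustration under assumptions, not the fork's actual code: `resolve_settings`, the setting keys, and the example values are hypothetical.

```python
from typing import Optional


def resolve_settings(app_prefs: dict, workspace: dict,
                     explicit: Optional[dict] = None) -> dict:
    """Merge settings so later layers win: app prefs < workspace < explicit choice."""
    resolved = dict(app_prefs)
    for layer in (workspace, explicit or {}):
        # A value of None means the layer has no opinion on that key.
        resolved.update({key: value for key, value in layer.items()
                         if value is not None})
    return resolved
```

For example, `resolve_settings({"model": "app-default", "mode": "build"}, {"model": "project-model", "mode": None})` keeps the project's model and falls back to the app's mode only because the workspace did not set one. A choice made explicitly on the phone for one session would go in `explicit` and win over both.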

Conclusion: Where This Is Going

I am really interested in personal software right now. I am contributing small changes back to Paseo's main branch when appropriate, but I would not be surprised if my fork keeps diverging. That is kind of the point. I am trying to make it work for me and my workflows.


Gemini CLI integration is probably next. I am also interested in usage tracking across providers. I want Paseo to become a sort of home base for my available usage, active projects, and the agents I can point at them. A control plane for my agents and projects.


Open source plus coding agents made the gap between "this tool is close" and "this tool fits me" much smaller. Open source software has always invited tinkering. AI agents make the tinkering cheaper.


And now, apparently, some of that tinkering can happen from my phone. Which is either very cool or a sign I should touch grass.


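Frequently Asked Questions (FAQ)

How do you remotely control coding agents from a phone?

Fork the open-source Paseo app, run its daemon on your computer, and connect the mobile app to it. You can then monitor and operate coding agents remotely, with no cloud service required.

What advantages does Paseo have over other tools?

Paseo is local-first and open source, supports Linux and Android, and does not require moving code to the cloud. With the author's changes it can also pick up sessions started in the terminal, which suits remote coding from a phone.

How can the phone remote-control experience for coding agents be improved?

Use AI agents to improve session handoff, default settings, and subagent transparency, so that the phone can seamlessly continue coding sessions started on the computer.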

