Smartphones are a uniquely challenging environment for agentic systems. Unlike cloud or desktop settings, mobile devices combine constrained execution contexts, fragmented control interfaces, and rapidly changing application states. As large language models (LLMs) evolve from conversational assistants into action-oriented agents, achieving reliable smartphone-native autonomy requires rethinking how reasoning and control are composed. We introduce ClawMobile as a concrete exploration of this design space. ClawMobile adopts a hierarchical architecture that separates high-level language reasoning from structured, deterministic control pathways, improving execution stability and reproducibility on real devices. Using ClawMobile as a case study, we distill design principles for mobile LLM runtimes and identify key challenges in efficiency, adaptability, and stability. We argue that building robust smartphone-native agentic systems demands principled coordination between probabilistic planning and deterministic system interfaces. The implementation is open-sourced to facilitate future exploration.\footnote{https://github.com/ClawMobile/ClawMobile}