While GUI agents have shown strong performance under explicit and complete instructions, real-world deployment requires aligning with users' more complex implicit intents. In this work, we highlight Hierarchical Implicit Intent Alignment for Personalized GUI Agent (PersonalAlign), a new agent task that requires agents to leverage long-term user records as persistent context in order to resolve preferences omitted from vague instructions and to anticipate latent routines from the user's state for proactive assistance. To facilitate this study, we introduce AndroidIntent, a benchmark designed to evaluate agents' ability to resolve vague instructions and provide proactive suggestions by reasoning over long-term user records. For evaluation, we annotate 775 user-specific preferences and 215 routines from 20k long-term records across different users. Furthermore, we introduce the Hierarchical Intent Memory Agent (HIM-Agent), which maintains a continuously updated personal memory and hierarchically organizes user preferences and routines for personalization. Finally, we evaluate a range of GUI agents on AndroidIntent, including GPT-5, Qwen3-VL, and UI-TARS. The results show that HIM-Agent significantly improves execution and proactive performance by 15.7% and 7.3%, respectively.