As digital environments (and their data distributions) are in flux, with new GUI data arriving over time and introducing new domains or resolutions, agents trained on static environments deteriorate in performance. In this work, we introduce Continual GUI Agents, a new task that requires GUI agents to perform continual learning under shifting domains and resolutions. We find that existing methods fail to maintain stable grounding as GUI distributions shift over time, owing to the diversity of UI interaction points and regions in such dynamic scenarios. To address this, we introduce GUI-Anchoring in Flux (GUI-AiF), a new reinforcement fine-tuning framework that stabilizes continual learning through two novel rewards: the Anchoring Point Reward in Flux (APR-iF) and the Anchoring Region Reward in Flux (ARR-iF). These rewards guide agents to align with shifting interaction points and regions, mitigating the tendency of existing reward strategies to over-adapt to static grounding cues (e.g., fixed coordinates or element scales). Extensive experiments show that GUI-AiF surpasses state-of-the-art baselines. Our work establishes the first continual learning framework for GUI agents, revealing the untapped potential of reinforcement fine-tuning for continual GUI agents.
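The abstract does not specify the reward formulas. As a purely illustrative sketch (the function name, shaping, and normalization are all hypothetical, not the paper's actual APR-iF definition), a resolution-normalized point-grounding reward that avoids over-adapting to fixed pixel coordinates might look like:

```python
def anchoring_point_reward(pred_xy, target_box, screen_wh):
    """Hypothetical point-grounding reward: 1.0 if the predicted click
    lands inside the target element's bounding box, otherwise a shaped
    value that decays with resolution-normalized distance to the box
    center (so the signal is comparable across screen resolutions)."""
    (x, y), (x1, y1, x2, y2), (w, h) = pred_xy, target_box, screen_wh
    if x1 <= x <= x2 and y1 <= y <= y2:
        return 1.0
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # Normalize the offset by screen width/height before measuring distance,
    # so the same miss in relative terms is penalized equally at any resolution.
    dist = (((x - cx) / w) ** 2 + ((y - cy) / h) ** 2) ** 0.5
    return max(0.0, 1.0 - dist)
```

For example, a click inside the target box earns the full reward, while a click 45% of the screen width away from the box center earns 0.55; normalizing by screen size rather than raw pixels is one way to keep such a reward stable when resolutions shift over time.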