Large Language Model (LLM)-powered web GUI agents are increasingly automating everyday online tasks. Despite their popularity, little is known about how users' preferences and values affect agents' reasoning and behavior. In this work, we investigate how explicit and implicit user preferences, as well as the underlying user values, influence agent decision-making and action trajectories. We built a controlled testbed of 14 common interactive web tasks spanning shopping, travel, dining, and housing, each replicated from a real website and integrated with a low-fidelity LLM-based recommender system. We injected 12 human preferences and values as personas into four state-of-the-art agents and systematically analyzed their task behaviors. Our results show that preference- and value-infused prompts consistently guided agents toward outcomes that reflected those preferences and values. Without preference or value guidance, agents exhibited a strong efficiency bias and employed shortest-path strategies; with such guidance, their behavior trajectories shifted toward greater use of the corresponding filters and interactive web features. Despite this influence, dominant interface cues, such as discounts and advertisements, frequently overrode these effects, shortening the agents' action trajectories and inducing rationalizations that masked rather than reflected value-consistent reasoning. The contributions of this paper are twofold: (1) an open-source testbed for studying the influence of values on agent behaviors, and (2) an empirical investigation of how user preferences and values shape web agent behaviors.