Large language model (LLM)-based computer use agents execute user commands by interacting with available UI elements, but little is known about how users want to interact with these agents or what design factors matter for their user experience (UX). We conducted a two-phase study to map the UX design space for computer use agents. In Phase 1, we reviewed existing systems to develop a taxonomy of UX considerations, then refined it through interviews with eight UX and AI practitioners. The resulting taxonomy included categories such as user prompts, explainability, user control, and users' mental models, with corresponding subcategories and example design features. In Phase 2, we ran a Wizard-of-Oz study with 20 participants, in which a researcher acted as a web-based computer use agent and probed user reactions during normal, error-prone, and risky execution. We used the findings to validate the taxonomy from Phase 1 and to deepen our understanding of the design space by identifying connections between design areas and divergences in user needs and scenarios. Our taxonomy and empirical insights provide a map for developers to consider different aspects of user experience in computer use agent design and to situate their designs within users' diverse needs and scenarios.