Modern task-oriented chatbots present GUI elements alongside natural-language dialogue, yet the agent's role has largely been limited to interpreting natural-language input as GUI actions and following a linear workflow. In preference-driven, multi-step tasks such as booking a flight or reserving a restaurant, earlier choices constrain later options and may force users to restart from scratch. User preferences serve as the key criteria for these decisions, yet existing agents do not systematically leverage them. We present MAESTRO, which extends the agent's role from execution to decision support. MAESTRO maintains a shared preference memory that extracts preferences from natural-language utterances with their strength, and provides two mechanisms. Preference-Grounded GUI Adaptation applies in-place operators (augment, sort, filter, and highlight) to the existing GUI according to preference strength, supporting within-stage comparison. Preference-Guided Workflow Navigation detects conflicts between preferences and available options, proposes backtracking, and records failed paths to avoid revisiting dead ends. We evaluated MAESTRO in a movie-booking Conversational Agent with GUI (CAG) through a within-subjects study with two conditions (Baseline vs. MAESTRO) and two modes (Text vs. Voice), with N = 33 participants.
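As an illustration of how the two mechanisms could fit together, the following is a minimal sketch, not MAESTRO's implementation. It assumes a two-level strength scale ("hard" and "soft"), assumes that hard preferences drive the filter operator while soft preferences drive sort and highlight, and treats an empty filtered option set as a conflict that triggers a backtracking proposal. The augment operator is omitted. All names (`Preference`, `PreferenceMemory`, `adapt_options`, `navigate`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Preference:
    attribute: str  # e.g. "time"
    value: str      # e.g. "evening"
    strength: str   # "hard" or "soft" (assumed two-level scale)


@dataclass
class PreferenceMemory:
    """Shared store of extracted preferences and dead-end paths (hypothetical)."""
    preferences: list = field(default_factory=list)
    failed_paths: set = field(default_factory=set)


def adapt_options(options, memory):
    """Apply filter, sort, and highlight to one stage's options.

    Returns (visible_options, highlighted_ids).
    """
    hard = [p for p in memory.preferences if p.strength == "hard"]
    soft = [p for p in memory.preferences if p.strength == "soft"]

    # Filter: drop options that contradict a hard preference.
    # An option that lacks the attribute is kept (assumption).
    visible = [
        o for o in options
        if all(o.get(p.attribute, p.value) == p.value for p in hard)
    ]

    def soft_score(option):
        # Number of soft preferences this option satisfies.
        return sum(option.get(p.attribute) == p.value for p in soft)

    # Sort: options matching more soft preferences come first;
    # the sort is stable, so ties keep their original order.
    visible = sorted(visible, key=soft_score, reverse=True)

    # Highlight: mark options that satisfy at least one soft preference.
    highlighted = {o["id"] for o in visible if soft_score(o) > 0}
    return visible, highlighted


def navigate(path, options, memory):
    """Decide what to do at the stage reached via `path`.

    Returns ("continue", options), ("backtrack", []) or ("skip", []).
    """
    path = tuple(path)
    if path in memory.failed_paths:
        return "skip", []  # known dead end, do not revisit
    visible, _ = adapt_options(options, memory)
    if not visible:
        # Conflict: no option satisfies the hard preferences.
        memory.failed_paths.add(path)
        return "backtrack", []
    return "continue", visible


# Usage with made-up movie-booking data.
memory = PreferenceMemory(preferences=[
    Preference("time", "evening", "hard"),
    Preference("format", "IMAX", "soft"),
])
showtimes = [
    {"id": "s1", "time": "evening", "format": "2D"},
    {"id": "s2", "time": "evening", "format": "IMAX"},
    {"id": "s3", "time": "morning", "format": "IMAX"},
]
visible, highlighted = adapt_options(showtimes, memory)
```

In this sketch the failed-path set is keyed by the tuple of earlier choices, so a backtracking proposal can exclude combinations that have already been shown to conflict with the stored preferences.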