What Questions Should Robots Be Able to Answer? A Dataset of User Questions for Explainable Robotics

With the growing use of large language models and conversational interfaces in human-robot interaction, robots' ability to answer user questions is more important than ever. We therefore introduce a dataset of 1,893 user questions for household robots, collected from 100 participants and organized into 12 categories and 70 subcategories. Most work in explainable robotics focuses on why-questions. In contrast, our dataset provides a wide variety of questions, from questions about simple execution details to questions about how the robot would act in hypothetical scenarios, giving roboticists insight into which questions their robots need to be able to answer. To collect the dataset, we created 15 video stimuli and 7 text stimuli, depicting robots performing varied household tasks. We then asked participants on Prolific what questions they would want to ask the robot in each portrayed situation. In the final dataset, the most frequent categories are questions about task execution details (21.4%), the robot's capabilities (12.6%), and performance assessments (10.7%). Although questions about how robots would handle potentially difficult scenarios and ensure correct behavior are less frequent, users rank them as the most important for robots to be able to answer. Moreover, we find that users who identify as novices in robotics ask different questions than more experienced users. Novices are more likely to inquire about simple facts, such as what the robot did or the current state of the environment. As robots enter environments shared with humans and language becomes central to instruction and interaction, this dataset provides a valuable foundation for (i) identifying the information robots need to log and expose to conversational interfaces, (ii) benchmarking question-answering modules, and (iii) designing explanation strategies that align with user expectations.
