Understanding human perceptions of robot performance is crucial for designing socially intelligent robots that can adapt to human expectations. Current approaches often rely on surveys, which can disrupt ongoing human-robot interactions. As an alternative, we explore predicting people's perceptions of robot performance using non-verbal behavioral cues and machine learning techniques. We contribute the SEAN TOGETHER Dataset, which consists of observations of an interaction between a person and a mobile robot in Virtual Reality, together with perceptions of robot performance provided by users on a 5-point scale. We then analyze how well humans and supervised learning techniques can predict perceived robot performance based on different observation types (such as facial expression and spatial behavior features). Our results suggest that facial expressions alone provide useful information, but in the navigation scenarios that we considered, reasoning about spatial features in context is critical for the prediction task. Supervised learning techniques also outperformed human predictions in most cases. Further, when predicting robot performance as a binary classification task on unseen users' data, the F1-score of machine learning models was more than double that of predictions on a 5-point scale. This suggests good generalization capabilities, particularly in identifying the direction of perceived performance rather than exact ratings. Based on these findings, we conducted a real-world demonstration in which a mobile robot uses a machine learning model to predict how a human following the robot perceives it. Finally, we discuss the implications of our results for implementing these supervised learning models in real-world navigation. Our work paves the way for automatically enhancing robot behavior based on observations of users and inferences about their perceptions of a robot.