Human impressions of robot performance are often measured through surveys. As a more scalable and cost-effective alternative, we investigate the possibility of predicting people's impressions of robot behavior using non-verbal behavioral cues and machine learning techniques. To this end, we first contribute the SEAN TOGETHER Dataset, which consists of observations of interactions between a person and a mobile robot in a VR simulation, together with impressions of robot performance provided by users on a 5-point scale. Second, we contribute analyses of how well humans and supervised learning techniques can predict perceived robot performance based on different observation types (such as facial expression features and features that describe the navigation behavior of the robot and nearby pedestrians). Our results suggest that facial expressions alone provide useful information about human impressions of robot performance; however, in the navigation scenarios we considered, reasoning about spatial features in context is critical for the prediction task. Supervised learning techniques also showed promise, outperforming humans' predictions of robot performance in most cases. Further, when robot performance was predicted as a binary classification task on unseen users' data, the F1 score of the machine learning models more than doubled compared to predicting performance on the 5-point scale. This suggests that the models can generalize well, although they are better at identifying the directionality of robot performance than at predicting exact performance ratings. Based on our findings in simulation, we conducted a real-world demonstration in which a mobile robot used a machine learning model to predict how a human following it perceived its performance. Finally, we discuss the implications of our results for implementing such supervised learning models in real-world navigation scenarios.