We often assume that robots that collaborate with humans should behave in ways that are transparent (e.g., legible, explainable). These transparent robots intentionally choose actions that convey their internal state to nearby humans: for instance, a transparent robot might exaggerate its trajectory to indicate its goal. But while transparent behavior seems beneficial for human-robot interaction, is it actually optimal? In this paper we consider collaborative settings where the human and robot have the same objective, and the human is uncertain about the robot's type (i.e., the robot's internal state). We extend a recursive combination of Bayesian Nash equilibrium and the Bellman equation to solve for optimal robot policies. Interestingly, we discover that it is not always optimal for collaborative robots to be transparent; instead, human and robot teams can sometimes achieve higher rewards when the robot is opaque. In contrast to transparent robots, opaque robots select actions that withhold information from the human. Our analysis suggests that opaque behavior becomes optimal when either (a) human-robot interactions have a short time horizon or (b) users are slow to learn from the robot's actions. We test this theoretical analysis in user studies with 43 total participants across online and in-person settings. We find that, during short interactions, users reach higher rewards when working with opaque partners and subjectively rate opaque robots as about equal to transparent robots. See videos of our experiments here: https://youtu.be/u8q1Z7WHUuI