Today's robots often assume that their behavior should be transparent. These transparent (e.g., legible, explainable) robots intentionally choose actions that convey their internal state to nearby humans. But while transparent behavior seems beneficial, is it actually optimal? In this paper we consider collaborative settings where the human and robot have the same objective, and the human is uncertain about the robot's type (i.e., the robot's internal state). We extend a recursive combination of Bayesian Nash equilibrium and the Bellman equation to solve for optimal robot policies. Interestingly, we discover that it is not always optimal for collaborative robots to be transparent; instead, human and robot teams can sometimes achieve higher rewards when the robot is opaque. Opaque robots select the same actions regardless of their internal state: because each type of opaque robot behaves in the same way, the human cannot infer the robot's type. Our analysis suggests that opaque behavior becomes optimal when either (a) human-robot interactions have a short time horizon or (b) users are slow to learn from the robot's actions. Across online and in-person user studies with 43 total participants, we find that users reach higher rewards when working with opaque partners, and subjectively rate opaque robots as about equal to transparent robots. See videos of our experiments here: https://youtu.be/u8q1Z7WHUuI
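The claim that a human cannot infer the type of an opaque robot follows from Bayes' rule: if every type takes the observed action with the same probability, the posterior equals the prior. The sketch below illustrates this for a single observation. It is not the paper's solver; the function name `bayes_update`, the two-type setup, and all probability values are assumptions chosen for illustration.

```python
def bayes_update(prior, likelihoods):
    """Return the human's posterior over robot types after one observed action.

    prior[i]       -- human's belief that the robot is type i (hypothetical values)
    likelihoods[i] -- probability that a type-i robot takes the observed action
    """
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observed action has zero probability under every type")
    return [u / total for u in unnormalized]


# Assumed setup: two robot types and a uniform prior.
prior = [0.5, 0.5]

# Opaque policy: both types take the observed action with the same probability,
# so the observation carries no information about the type.
opaque_posterior = bayes_update(prior, [1.0, 1.0])

# Transparent policy (illustrative numbers): type 0 is far more likely than
# type 1 to take the observed action, so the belief shifts toward type 0.
transparent_posterior = bayes_update(prior, [0.9, 0.1])

print("opaque:", opaque_posterior)
print("transparent:", transparent_posterior)
```

Under the opaque policy the belief stays at the prior, while under the transparent policy it moves toward type 0. Whether that information is worth the cost of conveying it is the trade-off the abstract describes.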