Theory of Mind (ToM) is a fundamental cognitive architecture that endows humans with the ability to attribute mental states to others. Humans infer the desires, beliefs, and intentions of others by observing their behavior and, in turn, adjust their own actions to facilitate better interpersonal communication and team collaboration. In this paper, we investigate a trust-aware robot policy with theory of mind in a multiagent setting where a human collaborates with a robot against another human opponent. We show that by focusing only on team performance, the robot may resort to reverse psychology tricks, which pose a significant threat to trust maintenance: the human's trust in the robot collapses when they discover the robot's deceptive behavior. To mitigate this problem, we adopt a robot theory of mind model to infer the human's trust beliefs, including true belief and false belief (an essential element of ToM). We design a dynamic trust-aware reward function based on these trust beliefs to guide robot policy learning, which aims to balance team performance against avoiding the collapse of human trust caused by the robot's reverse psychology. The experimental results demonstrate the importance of a ToM-based robot policy for human-robot trust and the effectiveness of our ToM-based robot policy in multiagent interaction settings.