New developments are enabling AI systems to perceive, recognize, and respond with social cues based on inferences made from humans' explicit or implicit behavioral and verbal cues. These AI systems, equipped with an equivalent of the human Theory of Mind (ToM) capability, are currently serving as matchmakers on dating platforms, assisting student learning as teaching assistants, and enhancing productivity as work partners. They mark a new era in human-AI interaction (HAI) that diverges from traditional human-computer interaction (HCI), where computers are commonly seen as tools rather than social actors. Designing for and understanding human perceptions and experiences in this emerging HAI era has become an urgent and critical issue if AI systems are to fulfill human needs and mitigate risks across social contexts. In this paper, we posit the Mutual Theory of Mind (MToM) framework, inspired by the human capacity for ToM in human-human communication, to guide this new generation of HAI research by highlighting the iterative and mutually shaping nature of human-AI communication. We discuss the motivation for the MToM framework and its three key components, which iteratively shape human-AI communication across three stages. We then describe two empirical studies inspired by the MToM framework to demonstrate the power of MToM in guiding the design and understanding of human-AI communication. Finally, we discuss future research opportunities in human-AI interaction through the lens of MToM.