Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs

Social intelligence and Theory of Mind (ToM), i.e., the ability to reason about the different mental states, intents, and reactions of all people involved, allow humans to effectively navigate and understand everyday social interactions. As NLP systems are used in increasingly complex social situations, their ability to grasp social dynamics becomes crucial. In this work, we examine the open question of social intelligence and Theory of Mind in modern NLP systems from an empirical and theory-based perspective. We show that one of today's largest language models (GPT-3; Brown et al., 2020) lacks this kind of social intelligence out-of-the-box, using two tasks: SocialIQa (Sap et al., 2019), which measures models' ability to understand the intents and reactions of participants in social interactions, and ToMi (Le et al., 2019), which measures whether models can infer the mental states and realities of participants in situations. Our results show that models struggle substantially at these Theory of Mind tasks, with well-below-human accuracies of 55% and 60% on SocialIQa and ToMi, respectively. To conclude, we draw on theories from pragmatics to contextualize this shortcoming of large language models, by examining the limitations stemming from their data, neural architecture, and training paradigms. Challenging the prevalent narrative that only scale is needed, we posit that person-centric NLP approaches might be more effective towards neural Theory of Mind. In our updated version, we also analyze newer instruction-tuned and RLHF models for neural ToM. We find that even ChatGPT and GPT-4 do not display emergent Theory of Mind; strikingly, even GPT-4 achieves only 60% accuracy on the ToMi questions related to mental states and realities.