Theory of Mind (ToM) is the ability to reason about one's own and others' mental states. ToM plays a critical role in the development of intelligence, language understanding, and cognitive processes. While previous work has primarily focused on first- and second-order ToM, we explore higher-order ToM, which involves recursive reasoning about others' beliefs. We introduce HI-TOM, a Higher Order Theory of Mind benchmark. Our experimental evaluation of various Large Language Models (LLMs) shows that performance declines on higher-order ToM tasks, demonstrating the limitations of current LLMs. We conduct a thorough analysis of the different failure cases of LLMs and share our thoughts on the implications of our findings for the future of NLP.