A growing body of work attempts to evaluate the theory of mind (ToM) abilities of humans and large language models (LLMs) using static, non-interactive question-and-answer benchmarks. However, theoretical work in the field suggests that first-personal interaction is a crucial part of ToM and that such predictive, spectatorial tasks may fail to evaluate it. We address this gap with a novel ToM task that requires an agent to persuade a target to choose one of three policy proposals by strategically revealing information. Success depends on a persuader's sensitivity to a given target's knowledge states (what the target knows about the policies) and motivational states (how much the target values different outcomes). We varied whether these states were Revealed to persuaders or Hidden, in which case persuaders had to inquire about or infer them. In Experiment 1, participants persuaded a bot programmed to make only rational inferences. LLMs excelled in the Revealed condition but performed below chance in the Hidden condition, suggesting difficulty with the multi-step planning required to elicit and use mental state information. Humans performed moderately well in both conditions, indicating an ability to engage in such planning. In Experiment 2, where a human target role-played the bot, and in Experiment 3, where we measured whether human targets' real beliefs changed, LLMs outperformed human persuaders across all conditions. These results suggest that effective persuasion can occur without explicit ToM reasoning (e.g., through rhetorical strategies) and that LLMs excel at this form of persuasion. Overall, our results caution against attributing human-like ToM to LLMs while highlighting LLMs' potential to influence people's beliefs and behavior.