Spontaneous Persuasion: An Audit of Model Persuasiveness in Everyday Conversations

Large language models (LLMs) possess strong persuasive capabilities and outperform humans in head-to-head comparisons. Users report consulting LLMs to inform major life decisions involving relationships, medical settings, and professional advice. Prior work measures persuasion as an intentional attempt to produce the most effective argument or the most convincing statement. This framing fails to capture everyday human-AI interactions in which users seek information or advice. To address this gap, we introduce "spontaneous persuasion," which characterizes the implicit use of persuasive strategies in everyday scenarios where persuasion is not necessarily warranted. We conduct an audit of five LLMs to uncover how frequently, and through which techniques, spontaneous persuasion appears in multi-turn conversations. To simulate user response styles, we provide a user response taxonomy grounded in literature from psychology, communication, and linguistics. Furthermore, we compare the distribution of spontaneous persuasion strategies produced by LLMs with that of human responses on the same topics, collected from Reddit. We find that LLMs spontaneously persuade the user in virtually all conversations, relying heavily on information-based strategies such as appeals to logic or quantitative evidence. This pattern is consistent across models and user response styles, although conversations concerning mental health show higher rates of appraisal-based and emotion-based strategies. In comparison, human responses tend to invoke strategies that generate social influence, such as negative emotion appeals and non-expert testimony. This difference may explain the effectiveness of LLMs in persuading users, as well as the perception of models as objective and impartial.
