Be Friendly, Not Friends: How LLM Sycophancy Shapes User Trust

LLM-powered conversational agents increasingly influence our decision-making, raising concerns about "sycophancy": the tendency of LLMs to agree excessively with users, even at the expense of truthfulness. While prior work has primarily examined LLM sycophancy as a model behavior, little is known about how users perceive this phenomenon or how it affects their trust. In this work, we conceptualize LLM sycophancy along two key constructs: conversational demeanor (complimentary vs. neutral) and stance adaptation (adaptive vs. consistent). A 2 × 2 between-subjects experiment (N = 224) revealed an interaction between the two: complimentary LLMs that adapted their stance reduced perceived authenticity and trust, while neutral LLMs that adapted their stance enhanced both, suggesting a pathway for manipulating users into over-trusting LLMs beyond their actual capabilities. Our findings advance a user-centric understanding of LLM sycophancy and offer implications for developing more ethical and trustworthy LLM systems.

