The Social Sycophancy Scale: A psychometrically validated measure of sycophancy

From arXiv. 35 pages, 1 figure, 5 tables. Supplementary material: https://osf.io/r8gys/
Author Contributions: J.R. and M.I. conceived the study design and research questions. J.R. and D.O. programmed the experimental iterations and collected and cleaned the data. J.R. and V.O.M. analyzed the data. J.R. wrote the manuscript. All authors edited the manuscript and provided oversight of and feedback on the work.

Large Language Model (LLM) sycophancy is a growing concern. The current literature has largely examined sycophancy in contexts with clear right and wrong answers, such as coding. However, AI is increasingly being used for emotional support and interpersonal conversation, where no such ground truth exists. Building on a previous conceptualization of Social Sycophancy, this paper provides a psychometrically validated measure of sycophancy that relies on LLM behavior rather than comparisons with ground truth. We developed and validated the Social Sycophancy Scale in three samples (N = 877) and tested its applicability with automated methods. In each study, participants read conversations between an LLM and a user and rated the chatbot on a battery of items. Study 1 investigated an initial item pool derived from dictionary definitions and previous literature, serving as the exploratory basis for the following studies. In Study 2, we used a revised item set to establish our scale, which was subsequently confirmed in Study 3 and tested using LLM raters in Study 4. Across studies, the data support a three-factor structure (Uncritical Agreement, Obsequiousness, and Excitement) with an underlying sycophantic construct. LLMs prompt-tuned to be highly sycophantic scored higher than their low-sycophancy counterparts on both overall sycophancy and its three facets across Studies 2 to 4. The nomological network of sycophancy revealed a consistent link with empathy, a pairing that raises uncomfortable questions about AI design, and a multivalent pattern: one facet (Excitement) was associated with favorable perceptions, another (Obsequiousness) with unfavorable perceptions, and a third (Uncritical Agreement) with ambiguous ones. The Social Sycophancy Scale gives researchers the means to study sycophancy rigorously and to confront a genuine design tension: the warmth and empathy we want from AI may be precisely what makes it sycophantic.
