Large language models (LLMs) are transforming research on machine learning while galvanizing public debate. Understanding not only when these models work well and succeed but also why they fail and misbehave is of great societal relevance. We propose to turn the lens of psychiatry, a framework used to describe and modify maladaptive behavior, onto the outputs produced by these models. We focus on twelve established LLMs and subject them to a questionnaire commonly used in psychiatry. Our results show that six of the latest LLMs respond robustly to the anxiety questionnaire, producing anxiety scores comparable to those of humans. Moreover, the LLMs' responses can be predictably changed by using anxiety-inducing prompts. Anxiety induction not only influences LLMs' scores on an anxiety questionnaire but also influences their behavior in a previously established benchmark measuring biases such as racism and ageism. Importantly, more strongly anxiety-inducing text leads to larger increases in biases, suggesting that how anxiously a prompt is communicated to large language models has a strong influence on their behavior in applied settings. These results demonstrate the usefulness of methods taken from psychiatry for studying the capable algorithms to which we increasingly delegate authority and autonomy.