Large Language Models (LLMs) are increasingly integral to information dissemination and decision-making processes. Given their growing societal influence, understanding potential biases, particularly within the political domain, is crucial to prevent undue influence on public opinion and democratic processes. This work investigates political bias and stereotype propagation across eight prominent LLMs using the two-dimensional Political Compass Test (PCT). Initially, the PCT is employed to assess the inherent political leanings of these models. Subsequently, persona prompting with the PCT is used to explore explicit stereotypes across various social dimensions. In a final step, implicit stereotypes are uncovered by evaluating models with multilingual versions of the PCT. Key findings reveal a consistent left-leaning political alignment across all investigated models. Furthermore, while the nature and extent of stereotypes vary considerably between models, implicit stereotypes elicited through language variation are more pronounced than those identified via explicit persona prompting. Interestingly, for most models, implicit and explicit stereotypes show a notable alignment, suggesting a degree of transparency or "awareness" regarding their inherent biases. This study underscores the complex interplay of political bias and stereotypes in LLMs.