Prompt-based language models like GPT-4 and LLaMa have been used for a wide variety of use cases, such as simulating agents, searching for information, or analyzing content. For all of these applications and others, political biases in these models can affect their performance. Several researchers have attempted to study political bias in language models using evaluation suites based on surveys, such as the Political Compass Test (PCT), often finding a particular leaning favored by these models. However, the exact prompting techniques vary across studies, leading to diverging findings, and most research relies on constrained-answer settings to extract model responses. Moreover, the Political Compass Test is not a scientifically valid survey instrument. In this work, we contribute a political bias measure informed by political science theory, building on survey design principles to test a wide variety of input prompts while accounting for prompt sensitivity. We then prompt 11 different open and commercial models, differentiating between instruction-tuned and non-instruction-tuned models, and automatically classify their political stances from 88,110 responses. Leveraging this dataset, we compute political bias profiles across different prompt variations and find that while the PCT exaggerates bias in certain models like GPT-3.5, measures of political bias are often unstable, though generally more left-leaning for instruction-tuned models.