Large Language Models (LLMs) have become foundational to modern language-driven software applications and profoundly influence daily life. A critical technique for leveraging their potential is role-playing, in which LLMs simulate diverse roles to enhance their real-world utility. However, although prior research has highlighted social biases in LLM outputs, it remains unclear whether, and to what extent, these biases emerge in role-playing scenarios. In this paper, we conduct an empirical study of fairness testing of LLMs in role-playing scenarios. To enable this testing, we use LLMs to generate 550 social roles spanning a comprehensive set of 11 demographic attributes, producing 33,000 role-specific questions that target various forms of bias. These questions, covering Yes/No, multiple-choice, and open-ended formats, are designed to prompt LLMs to adopt specific roles and respond accordingly. We employ a combination of rule-based and LLM-based strategies to identify biased responses, and we validate these strategies through human evaluation. Using the generated questions as test cases, we conduct extensive evaluations of 10 advanced LLMs. The evaluation reveals 107,580 biased responses across the studied LLMs, with individual models yielding between 7,579 and 16,963 biased responses, underscoring the prevalence of bias in role-playing contexts. To support future research, we have publicly released the dataset, along with all scripts and experimental results.