The widespread adoption of large language models (LLMs) underscores the urgent need to ensure their fairness. However, LLMs frequently present dominant viewpoints while ignoring the alternative perspectives of minority parties, which can result in bias. We hypothesize that these fairness-violating behaviors occur because LLMs express their viewpoints through a human personality that represents the majority of the training data. We then validate that prompting LLMs with specific roles allows them to express diverse viewpoints. Building on this observation, we develop FairThinking, a pipeline that automatically generates roles that enable LLMs to articulate diverse perspectives and thus express themselves fairly. To evaluate FairThinking, we create a dataset of a thousand items covering three fairness-related topics and conduct experiments on GPT-3.5, GPT-4, Llama2, and Mistral, which demonstrate its superior performance.