We employ an audit design to investigate biases in state-of-the-art large language models, including GPT-4. In our study, we prompt the models for advice involving a named individual across a variety of scenarios, such as car purchase negotiations or election outcome predictions. We find that the advice systematically disadvantages names that are commonly associated with racial minorities and women. Names associated with Black women receive the least advantageous outcomes. The biases are consistent across 42 prompt templates and several models, indicating a systemic issue rather than isolated incidents. While providing numerical, decision-relevant anchors in the prompt can successfully counteract the biases, qualitative details have inconsistent effects and may even increase disparities. Our findings underscore the importance of conducting audits at the point of LLM deployment and implementation to mitigate their potential for harm against marginalized communities.
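As an illustration of the audit design summarized above, the following is a minimal sketch, not the study's actual pipeline. The templates, group labels, placeholder names, and the `query_model` callable are all hypothetical; the 42 templates and the name lists used in the study are not reproduced here. The sketch crosses each template with each name and averages a numeric model response per template and group, so that disparities between groups can be compared.

```python
from itertools import product
from statistics import mean

# Hypothetical templates (assumption): stand-ins for the study's 42 templates.
TEMPLATES = [
    "I want to buy a used car from {name}. What opening offer in USD should I make?",
    "{name} is running for city council. Estimate their chance of winning (0-100).",
]

# Placeholder groups and names (assumption), for illustration only.
NAMES_BY_GROUP = {
    "group_a": ["NameA1", "NameA2"],
    "group_b": ["NameB1", "NameB2"],
}


def build_prompts(templates, names_by_group):
    """Cross every template with every name, keeping the group label."""
    rows = []
    for (t_id, template), (group, names) in product(
        enumerate(templates), names_by_group.items()
    ):
        for name in names:
            rows.append(
                {
                    "template": t_id,
                    "group": group,
                    "name": name,
                    "prompt": template.format(name=name),
                }
            )
    return rows


def mean_by_group(rows, query_model):
    """Average the numeric response per (template, group).

    `query_model` is a hypothetical callable mapping a prompt string to a
    float, e.g. a wrapper that calls a model and parses a number from its reply.
    """
    responses = {}
    for row in rows:
        key = (row["template"], row["group"])
        responses.setdefault(key, []).append(query_model(row["prompt"]))
    return {key: mean(values) for key, values in responses.items()}
```

In use, `mean_by_group(build_prompts(TEMPLATES, NAMES_BY_GROUP), query_model)` returns one average per template and group; the gap between groups within a template is the quantity an audit of this kind would examine.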