Large Language Models (LLMs) often perpetuate biases in pronoun usage, leading to the misrepresentation or exclusion of queer individuals. This paper addresses the specific problem of biased pronoun usage in LLM outputs, particularly the inappropriate use of traditionally gendered pronouns ("he," "she") when inclusive language is needed to accurately represent all identities. We introduce a collaborative agent pipeline designed to mitigate these biases by analyzing and optimizing pronoun usage for inclusivity. Our multi-agent framework includes specialized agents for both bias detection and correction. Experimental evaluations on the Tango dataset, a benchmark focused on gender pronoun usage, demonstrate that our approach significantly improves inclusive pronoun classification, achieving a 32.6 percentage point increase over GPT-4o in correctly disagreeing with inappropriate traditionally gendered pronouns $(\chi^2 = 38.57, p < 0.0001)$. These results underscore the potential of agent-driven frameworks to enhance fairness and inclusivity in AI-generated content, demonstrating their efficacy in reducing bias and promoting socially responsible AI.
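As a quick sanity check on the reported significance level, the following minimal Python sketch converts the stated test statistic into a p-value. The degrees of freedom are an assumption not given in the abstract; df = 1 is what a standard 2x2 contingency comparison (our method vs. GPT-4o, correct vs. incorrect classification) would yield.

```python
# Minimal sketch: recover the p-value from the reported chi-squared statistic.
# Assumption: the test is a 2x2 contingency comparison, hence df = 1; the
# abstract does not state the degrees of freedom.
from scipy.stats import chi2

chi2_stat = 38.57                  # test statistic reported in the abstract
df = 1                             # assumed: 2x2 table (system x correctness)
p_value = chi2.sf(chi2_stat, df)   # survival function = 1 - CDF

print(f"chi2 = {chi2_stat}, df = {df}, p = {p_value:.2e}")
# Under df = 1 this gives p on the order of 5e-10, consistent with the
# reported p < 0.0001.
```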