The growing influence of Large Language Models (LLMs) on public discourse and decision-making makes it imperative to address the biases inherent in these AI systems. As AI is integrated across more sectors, addressing racial bias in LLMs has never been more critical. This paper introduces the Comprehensive Bias Neutralization Framework (CBNF), a novel approach to quantifying and mitigating biases within LLMs. The framework combines the Large Language Model Bias Index (LLMBI) [Oketunji, A., Anas, M., Saina, D., (2023)] and Bias removaL with No Demographics (BLIND) [Orgad, H., Belinkov, Y. (2023)] methodologies to create a new metric, the Bias Intelligence Quotient (BiQ), which detects, measures, and mitigates racial bias in LLMs without relying on demographic annotations. Because BiQ enhances LLMBI with additional fairness metrics, CBNF offers a multi-dimensional metric for bias assessment, underscoring the need for a nuanced approach to fairness in AI [Mehrabi et al., 2021]. This paper presents a detailed analysis of Latimer AI (a language model incrementally trained on Black history and culture) in comparison to ChatGPT 3.5, illustrating Latimer AI's efficacy in detecting racial, cultural, and gender biases through targeted training and refined bias mitigation strategies [Latimer & Bender, 2023].