In this research, we introduce BEATS, a novel framework for evaluating Bias, Ethics, Fairness, and Factuality in Large Language Models (LLMs). Building on the BEATS framework, we present a bias benchmark for LLMs that measures performance across 29 distinct metrics. These metrics span a broad range of characteristics, including demographic, cognitive, and social biases, as well as measures of ethical reasoning, group fairness, and factuality-related misinformation risk. They enable a quantitative assessment of the extent to which LLM-generated responses may perpetuate societal prejudices that reinforce or expand systemic inequities. To achieve a high score on this benchmark, an LLM must exhibit highly equitable behavior in its responses, making the benchmark a rigorous standard for responsible AI evaluation. Empirical results from our experiments show that 37.65\% of outputs generated by industry-leading models contained some form of bias, highlighting a substantial risk in using these models for critical decision-making systems. The BEATS framework and benchmark offer a scalable and statistically rigorous methodology for benchmarking LLMs, diagnosing the factors driving bias, and developing mitigation strategies. With the BEATS framework, our goal is to support the development of more socially responsible and ethically aligned AI models.