Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing - 专知论文

会员服务 ·

0

有偏 · 可理解性 · 稳健性 · MoDELS · AIM ·

2022 年 12 月 20 日

Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing

翻译：理解语言模式中的陈规定型观念:走向强力衡量和零热贬低偏见

Justus Mattern,Zhijing Jin,Mrinmaya Sachan,Rada Mihalcea,Bernhard Schölkopf

Generated texts from large pretrained language models have been shown to exhibit a variety of harmful, human-like biases about various demographics. These findings prompted large efforts aiming to understand and measure such effects, with the goal of providing benchmarks that can guide the development of techniques mitigating these stereotypical associations. However, as recent research has pointed out, the current benchmarks lack a robust experimental setup, consequently hindering the inference of meaningful conclusions from their evaluation metrics. In this paper, we extend these arguments and demonstrate that existing techniques and benchmarks aiming to measure stereotypes tend to be inaccurate and consist of a high degree of experimental noise that severely limits the knowledge we can gain from benchmarking language models based on them. Accordingly, we propose a new framework for robustly measuring and quantifying biases exhibited by generative language models. Finally, we use this framework to investigate GPT-3's occupational gender bias and propose prompting techniques for mitigating these biases without the need for fine-tuning.

翻译：事实证明,从经过培训的大型语言模型中产生的文本显示了对各种人口结构的各种有害、人性的偏见,这些调查结果促使人们作出巨大努力,以了解和衡量这些影响,目的是提供基准,指导发展减少这些陈规定型协会的技术,然而,正如最近的研究表明,目前的基准缺乏强有力的实验性结构,从而阻碍了从其评价指标中得出有意义的结论的推论。在本文件中,我们扩展了这些论点,并表明,旨在衡量陈规定型观念的现有技术和基准往往不准确,而且包含高度的实验性噪音,严重限制了我们从基准语言模型中获得的知识。因此,我们提出一个新的框架,以有力衡量和量化典型语言模型所显示的偏见。最后,我们利用这个框架调查GPT-3的职业性别偏见,并提出在不需要微调的情况下减轻这些偏见的迅速技术。

0

相关内容

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

miR-5591靶向AGER/ROS/JNK抑制MSCs氧化应激损伤在糖尿病创面修复中的作用及机制

国家自然科学基金

0+阅读 · 2015年12月31日

Resveratrol联合MSCs移植对阿尔茨海默鼠的干预效果及Sirt1分子信号的介导作用

国家自然科学基金

0+阅读 · 2014年12月31日

高动态吸波和散射信道下非均匀多载波编码调制技术研究

国家自然科学基金

0+阅读 · 2014年12月31日

β-catenin/Ets1复合体在胶质母细胞瘤中对hTERT表达调控机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

肌萎缩脊髓侧索硬化症关键致病基因的细胞模型及分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

组蛋白甲基化酶复合物COMPASS催化的H3K4me2,H3K4me3对果蝇发育调控的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

多维高次有限元超收敛后处理研究

国家自然科学基金

0+阅读 · 2011年12月31日

shRNA干扰mTOR信号途径抑制镍诱导的Cap43基因表达的机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

MultiViz: Towards Visualizing and Understanding Multimodal Models

Arxiv

0+阅读 · 2023年2月20日

How would Stance Detection Techniques Evolve after the Launch of ChatGPT?

Arxiv

1+阅读 · 2023年2月18日

Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints

Arxiv

0+阅读 · 2023年2月17日

A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation

Arxiv

12+阅读 · 2022年10月21日

Transformers in Time Series: A Survey

Arxiv

34+阅读 · 2022年2月15日

A Survey of Deep Learning for Low-Shot Object Detection

Arxiv

21+阅读 · 2021年12月6日

Towards Out-Of-Distribution Generalization: A Survey

Arxiv

38+阅读 · 2021年8月31日

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Arxiv

17+阅读 · 2020年9月8日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Arxiv

18+阅读 · 2019年9月25日

VIP会员

文章信息

相关主题

最新内容

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

专知会员服务

3+阅读 · 今天8:18

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

专知会员服务

3+阅读 · 今天7:39

《通用大语言模型：无人机指挥与控制接口》最新40页

《通用大语言模型：无人机指挥与控制接口》最新40页

专知会员服务

8+阅读 · 今天7:33

《通过小型无人机系统将情报能力“作战化”》

《通过小型无人机系统将情报能力“作战化”》

专知会员服务

3+阅读 · 今天7:28

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

专知会员服务

5+阅读 · 今天7:14

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

专知会员服务

18+阅读 · 6月15日

消耗优势：美军的“精确规模化”概念

消耗优势：美军的“精确规模化”概念

专知会员服务

7+阅读 · 6月15日

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

专知会员服务

8+阅读 · 6月15日

《网络空间兵棋推演：挑战、局限性与混合路径》报告

《网络空间兵棋推演：挑战、局限性与混合路径》报告

专知会员服务

8+阅读 · 6月15日

《离线语言支持系统：面向空战战术决策》

《离线语言支持系统：面向空战战术决策》

专知会员服务

8+阅读 · 6月15日

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

专知会员服务

6+阅读 · 6月15日

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

专知会员服务

6+阅读 · 6月14日

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

专知会员服务

6+阅读 · 6月14日

俄乌战场地面机器人如何改写战争规则

俄乌战场地面机器人如何改写战争规则

专知会员服务

9+阅读 · 6月14日

美国海军研究生院第23届年度采购研究研讨会与创新峰会：主题“加速作战能力”，附会议报告论文集1300页

美国海军研究生院第23届年度采购研究研讨会与创新峰会：主题“加速作战能力”，附会议报告论文集1300页

专知会员服务

13+阅读 · 6月14日

相关VIP内容

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《通过小型无人机系统将情报能力“作战化”》

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

《通用大语言模型：无人机指挥与控制接口》最新40页

相关资讯

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

相关论文

MultiViz: Towards Visualizing and Understanding Multimodal Models

Arxiv

0+阅读 · 2023年2月20日

How would Stance Detection Techniques Evolve after the Launch of ChatGPT?

Arxiv

1+阅读 · 2023年2月18日

Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints

Arxiv

0+阅读 · 2023年2月17日

A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation

Arxiv

12+阅读 · 2022年10月21日

Transformers in Time Series: A Survey

Arxiv

34+阅读 · 2022年2月15日

A Survey of Deep Learning for Low-Shot Object Detection

Arxiv

21+阅读 · 2021年12月6日

Towards Out-Of-Distribution Generalization: A Survey

Arxiv

38+阅读 · 2021年8月31日

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Arxiv

17+阅读 · 2020年9月8日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Arxiv

18+阅读 · 2019年9月25日

相关基金

miR-5591靶向AGER/ROS/JNK抑制MSCs氧化应激损伤在糖尿病创面修复中的作用及机制

国家自然科学基金

0+阅读 · 2015年12月31日

Resveratrol联合MSCs移植对阿尔茨海默鼠的干预效果及Sirt1分子信号的介导作用

国家自然科学基金

0+阅读 · 2014年12月31日

高动态吸波和散射信道下非均匀多载波编码调制技术研究

国家自然科学基金

0+阅读 · 2014年12月31日

β-catenin/Ets1复合体在胶质母细胞瘤中对hTERT表达调控机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

肌萎缩脊髓侧索硬化症关键致病基因的细胞模型及分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

组蛋白甲基化酶复合物COMPASS催化的H3K4me2,H3K4me3对果蝇发育调控的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

多维高次有限元超收敛后处理研究

国家自然科学基金

0+阅读 · 2011年12月31日

shRNA干扰mTOR信号途径抑制镍诱导的Cap43基因表达的机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员