The evaluation of societal biases in NLP models is critically hindered by a geo-cultural gap. This gap leaves regions such as Latin America severely underserved, making it impossible to adequately assess or mitigate the harmful regional stereotypes that language technologies perpetuate. This paper presents LACES, a stereotype association dataset covering 15 Latin American countries. The dataset includes 4,789 stereotype associations manually created and annotated by 83 participants, and was developed through targeted community partnerships across Latin America. We also propose a novel adaptive data collection methodology that integrates the sourcing of new stereotype entries and the validation of existing data within a single, unified workflow. This approach yields more unique stereotypes than previous static collection methods, making stereotype collection more efficient. We further support the quality of LACES by showing that debiasing methods are less effective on this dataset than on existing popular stereotype benchmarks.