While understanding and removing gender biases in language models has been a long-standing problem in Natural Language Processing, prior research has primarily been limited to English. In this work, we investigate some of the challenges of evaluating and mitigating biases in multilingual settings, which stem from a lack of existing benchmarks and resources for bias evaluation beyond English, especially for non-Western contexts. We first create a benchmark for evaluating gender biases in pre-trained masked language models by extending DisCo to different Indian languages using human annotations. We then extend various debiasing methods to work beyond English and evaluate their effectiveness for state-of-the-art massively multilingual models on our proposed metric. Overall, our work highlights the challenges that arise while studying social biases in multilingual settings and provides resources as well as mitigation techniques to take a step toward scaling to more languages.