Recent studies have revealed that widely used Pre-trained Language Models (PLMs) propagate societal biases from their large, unmoderated pre-training corpora. Existing solutions require dedicated debiasing training processes and datasets, which are resource-intensive and costly. Furthermore, these methods hurt the PLMs' performance on downstream tasks. In this study, we propose Gender-tuning, which debiases PLMs through fine-tuning on downstream tasks' datasets. To this end, Gender-tuning integrates Masked Language Modeling (MLM) training objectives into the fine-tuning training process. Comprehensive experiments show that Gender-tuning outperforms state-of-the-art baselines in terms of average gender bias scores in PLMs while improving the PLMs' performance on downstream tasks, using only the downstream tasks' datasets. Moreover, Gender-tuning is a deployable debiasing tool for any PLM that works with the original fine-tuning.
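The idea of folding an MLM objective into fine-tuning can be illustrated with a minimal sketch. The abstract does not specify which tokens are masked or how the two losses are combined, so everything here is an assumption made for illustration: the word list `GENDER_WORDS`, the helper names `mask_gender_words` and `joint_loss`, the choice to mask gender-related words, and the weighted sum with the hypothetical parameter `mlm_weight`.

```python
# Illustrative sketch only; not the authors' implementation.
# GENDER_WORDS is a hypothetical word list, not taken from the source.
GENDER_WORDS = {"he", "she", "him", "her", "his", "hers", "man", "woman"}
MASK = "[MASK]"


def mask_gender_words(tokens, gender_words=GENDER_WORDS, mask_token=MASK):
    """Replace gender-related tokens with mask_token.

    Returns (masked_tokens, labels). labels[i] holds the original token at
    masked positions and None elsewhere, so an MLM loss would be computed
    only over the masked positions.
    """
    masked, labels = [], []
    for tok in tokens:
        if tok.lower() in gender_words:
            masked.append(mask_token)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


def joint_loss(task_loss, mlm_loss, mlm_weight=1.0):
    """Combine the downstream-task loss and the MLM loss.

    A weighted sum is an assumption; the abstract only states that the
    MLM objective is integrated into the fine-tuning process.
    """
    return task_loss + mlm_weight * mlm_loss


if __name__ == "__main__":
    tokens = "She thanked him for the review".split()
    masked, labels = mask_gender_words(tokens)
    print(masked)
    print(labels)
    print(joint_loss(0.5, 0.25, mlm_weight=2.0))
```

In an actual training loop, the masked sequence would be passed through the PLM once, with an MLM head predicting the masked tokens and a task head predicting the downstream label; the two losses would then be combined before backpropagation, which is what `joint_loss` stands in for here.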