Machine Translation (MT) continues to improve in quality and adoption, yet the inadvertent perpetuation of gender bias remains a significant concern. Despite numerous studies of gender bias in translations from gender-neutral languages such as Turkish into more strongly gendered languages like English, there are no benchmarks for evaluating this phenomenon or for assessing mitigation strategies. To address this gap, we introduce GATE X-E, an extension to the GATE corpus (Rarrick et al., 2023) that consists of human translations from Turkish, Hungarian, Finnish, and Persian into English. Each translation is accompanied by feminine, masculine, and neutral variants for each possible gender interpretation. The dataset contains between 1,250 and 1,850 instances for each of the four language pairs and features natural sentences spanning a wide range of lengths and domains, challenging translation rewriters on various linguistic phenomena. Additionally, we present an English gender rewriting solution built on GPT-3.5 Turbo and use GATE X-E to evaluate it. We open-source our contributions to encourage further research on gender debiasing.