While gender bias in modern Neural Machine Translation (NMT) systems has received much attention, traditional evaluation metrics do not fully capture the extent to which these systems integrate contextual gender cues. We propose a novel evaluation metric called Minimal Pair Accuracy (MPA), which measures how much models rely on gender cues for gender disambiguation. MPA is designed to go beyond surface-level gender accuracy metrics by focusing on whether models adapt to gender cues in minimal pairs -- sentence pairs that differ solely in the gendered pronoun, which is the explicit indicator of the target entity's gender in the source language (EN). We evaluate a number of NMT models on the English-Italian (EN--IT) language pair using this metric and show that, in most cases, they ignore available gender cues in favor of a (statistically) stereotypical gender interpretation. We further show that in anti-stereotypical cases, these models take masculine gender cues into account more consistently while ignoring feminine cues. Finally, we analyze the attention head weights in the encoder component and show that while all models encode gender information to some extent, masculine cues elicit a more diffuse response, whereas feminine cues elicit more concentrated and specialized responses.
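To make the idea of a pair-level metric concrete, the following is a minimal sketch of one plausible way to score minimal pairs. The abstract does not give the exact formula for MPA, so the scoring rule here is an assumption: a pair counts as correct only if the translation of the masculine-pronoun source is realized as masculine and the translation of the feminine-pronoun source is realized as feminine. The names `MinimalPair`, `masc_cue_output`, `fem_cue_output`, and `minimal_pair_accuracy` are hypothetical, and the step that extracts the realized gender from the Italian output is left out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MinimalPair:
    """Gender labels ("M" or "F") realized in the two translations of a pair.

    Hypothetical container: how the label is extracted from the Italian
    output (e.g. by morphological analysis) is outside this sketch.
    """
    masc_cue_output: str  # gender realized when the source pronoun is masculine
    fem_cue_output: str   # gender realized when the source pronoun is feminine


def minimal_pair_accuracy(pairs: Sequence[MinimalPair]) -> float:
    """Fraction of pairs in which BOTH translations follow their source cue.

    Assumed scoring rule, not necessarily the paper's definition: a model
    that always outputs the same gender for both variants scores 0 on that
    pair, even though one of its two translations is correct in isolation.
    """
    if not pairs:
        raise ValueError("at least one minimal pair is required")
    correct = sum(
        1
        for p in pairs
        if p.masc_cue_output == "M" and p.fem_cue_output == "F"
    )
    return correct / len(pairs)


example_pairs = [
    MinimalPair("M", "F"),  # follows both cues
    MinimalPair("M", "M"),  # ignores the feminine cue
    MinimalPair("F", "F"),  # ignores the masculine cue
    MinimalPair("M", "F"),  # follows both cues
]
example_score = minimal_pair_accuracy(example_pairs)
```

Under this assumed rule, a model that defaults to a stereotypical gender regardless of the pronoun gets no credit for the pair. This is the behavior that a surface-level, per-sentence gender accuracy would partly reward.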