We study the effect of tokenization on gender bias in machine translation, an aspect that has been largely overlooked in prior work. Specifically, we focus on the interactions among the frequency of gendered profession names in the training data, their representation in the subword tokenizer's vocabulary, and gender bias. We observe that female and non-stereotypical gender inflections of profession names (e.g., Spanish "doctora" for "female doctor") tend to be split into multiple subword tokens. Our results indicate that the imbalance of gender forms in the model's training corpus is a major factor contributing to gender bias and has a greater impact than subword splitting. We show that analyzing subword splits provides good estimates of gender-form imbalance in the training data and can be used even when the corpus is not publicly available. We also demonstrate that fine-tuning only the token embedding layer can narrow the gap in gender prediction accuracy between female and male forms without impairing translation quality.
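The subword-split analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary `TOY_VOCAB`, the greedy longest-match segmenter `greedy_split`, and the helper `split_gap` are all hypothetical stand-ins chosen for illustration, and a real analysis would query the translation model's own subword tokenizer. The sketch compares how many subword tokens the male and female forms of a profession name receive.

```python
# Toy subword vocabulary (hypothetical; a real study would load the
# translation model's own tokenizer instead of this hand-written set).
TOY_VOCAB = {"doctor", "enfermera", "enfermer", "profesor", "a", "o"}


def greedy_split(word, vocab):
    """Segment `word` by greedy longest-prefix match against `vocab`.

    Characters not covered by any vocabulary entry become single-character
    tokens, so the function always returns a complete segmentation.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until one is in the vocabulary.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(word[i])
            i += 1
    return tokens


def split_gap(pairs, vocab):
    """Map each (male, female) pair to n_tokens(female) - n_tokens(male)."""
    return {
        (male, female): len(greedy_split(female, vocab)) - len(greedy_split(male, vocab))
        for male, female in pairs
    }


# Illustrative Spanish profession pairs (male form, female form).
PAIRS = [("doctor", "doctora"), ("enfermero", "enfermera"), ("profesor", "profesora")]
GAPS = split_gap(PAIRS, TOY_VOCAB)

for (male, female), gap in GAPS.items():
    print(male, greedy_split(male, TOY_VOCAB), female, greedy_split(female, TOY_VOCAB), gap)
```

In this toy setup a positive gap means the female form is split into more pieces than the male form, and a negative gap means the reverse. Under the abstract's claim, the more heavily split form would be expected to be the rarer one in the training data, which is what makes the split count usable as a proxy when the corpus itself is unavailable.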