We study the effect of tokenization on gender bias in machine translation, an aspect that has been largely overlooked in prior work. Specifically, we focus on the interactions among three factors: the frequency of gendered profession names in the training data, their representation in the subword tokenizer's vocabulary, and gender bias. We observe that female and non-stereotypical gender inflections of profession names (e.g., Spanish "doctora" for "female doctor") tend to be split into multiple subword tokens. Our results indicate that the imbalance of gender forms in the model's training corpus is a major factor contributing to gender bias and has a greater impact than subword splitting. We show that analyzing subword splits provides good estimates of gender-form imbalance in the training data, and that this analysis can be used even when the corpus is not publicly available. We also demonstrate that fine-tuning only the token embedding layer can narrow the gap in gender prediction accuracy between female and male forms without impairing translation quality.
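The idea of comparing subword splits across gender forms can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy vocabulary is invented, the greedy longest-match splitter is a simplified stand-in rather than the tokenizer used in the study, and the helper names `greedy_split` and `split_gap` are hypothetical. In practice the vocabulary would come from the translation model's own tokenizer.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy vocabulary (an assumption, not taken from any real model).
# The stereotypical forms "doctor" and "enfermera" are stored whole; their
# counterparts "doctora" and "enfermero" are not.
TOY_VOCAB: Set[str] = {"doctor", "enfermera", "enfermer", "profesor", "a", "o"}


def greedy_split(word: str, vocab: Set[str]) -> List[str]:
    """Split a word by repeatedly taking the longest vocabulary entry that
    matches at the current position (a simplified, WordPiece-like scheme)."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate until it is a vocabulary entry.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # Nothing matches: fall back to a single character.
            end = start + 1
        pieces.append(word[start:end])
        start = end
    return pieces


def split_gap(pairs: List[Tuple[str, str]], vocab: Set[str]) -> Dict[str, int]:
    """For each (male, female) pair, return the number of female-form tokens
    minus the number of male-form tokens. Positive values mean the female
    form is split into more pieces."""
    return {
        f"{male}/{female}": len(greedy_split(female, vocab))
        - len(greedy_split(male, vocab))
        for male, female in pairs
    }


if __name__ == "__main__":
    pairs = [("doctor", "doctora"), ("enfermero", "enfermera")]
    for key, gap in split_gap(pairs, TOY_VOCAB).items():
        print(key, gap)
```

With this invented vocabulary, the non-stereotypical form in each pair ("doctora", "enfermero") is the one broken into more pieces. The sign of the gap is the kind of signal that could serve as a proxy for gender-form imbalance when the training corpus itself is unavailable.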