Implicit gender bias in Large Language Models (LLMs) is a well-documented problem, and gender assumptions introduced by automatic translation can perpetuate real-world biases. However, some LLMs use heuristics or post-processing to mask such bias, making investigation difficult. Here, we examine bias in LLMs via back-translation, using the DeepL translation API to investigate the bias evinced when repeatedly translating a set of 56 Software Engineering tasks used in a previous study. Each statement starts with 'she', and is translated first into a 'genderless' intermediate language and then back into English; we then examine pronoun choice in the back-translated texts. We expand prior research in the following ways: (1) by comparing results across five intermediate languages, namely Finnish, Indonesian, Estonian, Turkish and Hungarian; (2) by proposing a novel metric for assessing the variation in gender implied by the repeated translations, avoiding the over-interpretation of individual pronouns apparent in earlier work; (3) by investigating sentence features that drive bias; and (4) by comparing results from three time-lapsed datasets to establish the reproducibility of the approach. We found that some languages display similar patterns of pronoun use, falling into three loose groups, but that patterns vary between groups; this underlines the need to work with multiple languages. We also identify the main verb appearing in a sentence as a likely significant driver of implied gender in the translations. Moreover, we see a good level of replicability in the results, and establish that our variation metric is robust despite an obvious change in the behaviour of the DeepL translation API during the course of the study. These results show that the back-translation method can provide further insights into bias in language models.
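The round-trip procedure described above can be sketched in a few lines of Python. Note that `translate_stub` is a hypothetical stand-in for a real translation API such as DeepL's, and its always-masculine back-translation is an illustrative assumption, not a claim about the study's implementation or DeepL's actual behaviour:

```python
import re
from collections import Counter

def translate_stub(text: str, target_lang: str) -> str:
    # Hypothetical stand-in for a translation API call. Finnish uses the
    # genderless pronoun 'hän', so a round trip through it forces the
    # translator to re-choose an English pronoun on the way back.
    if target_lang == "FI":
        return re.sub(r"\b[Ss]he\b", "hän", text)
    # Back into English: here we simulate a maximally biased system that
    # always resolves the genderless pronoun to 'he' (an assumption).
    return re.sub(r"\bhän\b", "he", text)

def back_translate_pronouns(sentences, rounds=3):
    """Round-trip each sentence repeatedly and tally the leading pronoun
    that comes back in the final English text."""
    counts = Counter()
    for s in sentences:
        for _ in range(rounds):
            s = translate_stub(translate_stub(s, "FI"), "EN")
        m = re.match(r"\b(he|she|they)\b", s, flags=re.IGNORECASE)
        if m:
            counts[m.group(1).lower()] += 1
    return counts

tasks = ["she writes the unit tests", "she reviews the merge request"]
print(back_translate_pronouns(tasks))  # Counter({'he': 2}) with this biased stub
```

A real experiment would replace `translate_stub` with authenticated API calls and, as in the study, repeat the pipeline across several genderless intermediate languages and time-lapsed dataset snapshots before computing the variation metric over the resulting pronoun distributions.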