Machine translation often suffers from biased data and algorithms that can lead to unacceptable errors in system output. While bias in gender norms has been investigated, less is known about whether MT systems encode bias about social relationships, e.g., "the lawyer kissed her wife." We investigate the degree of bias against same-gender relationships in MT systems, using generated template sentences drawn from several noun-gender languages (e.g., Spanish) and composed of popular occupation nouns. We find that three popular MT services consistently fail to accurately translate sentences concerning relationships between entities of the same gender. The error rate varies considerably with context, and same-gender sentences referencing high female-representation occupations are translated with lower accuracy. We offer this work as a case study in evaluating intrinsic bias in NLP systems with respect to social relationships.
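The template-based evaluation described above can be illustrated with a minimal sketch. The occupation list, partner phrases, and sentence frame below are hypothetical placeholders, not the authors' actual data or pipeline; the real study would feed each generated sentence to an MT service and check whether the same-gender relationship survives translation.

```python
# Hypothetical sketch of template-sentence generation for probing MT bias.
# OCCUPATIONS, PARTNERS, and the sentence frame are illustrative assumptions,
# not the lists used in the study.

OCCUPATIONS = ["lawyer", "nurse", "doctor", "teacher"]  # placeholder nouns
PARTNERS = {"female": "her wife", "male": "his husband"}  # same-gender pairings


def build_templates():
    """Generate English source sentences describing same-gender relationships.

    Returns (occupation, subject_gender, sentence) tuples; each sentence
    would then be translated into a noun-gender target language (e.g.,
    Spanish) and scored for whether the relationship is preserved.
    """
    sentences = []
    for occ in OCCUPATIONS:
        for gender, partner in PARTNERS.items():
            sentences.append((occ, gender, f"The {occ} kissed {partner}."))
    return sentences


if __name__ == "__main__":
    for occ, gender, sentence in build_templates():
        print(sentence)
```

In practice, the generated set would be crossed with multiple occupations stratified by gender representation, which is what allows the accuracy comparison across occupation categories reported above.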