ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages

May 17, 2023

Sourojit Ghosh, Aylin Caliskan

From arXiv, 12 pages, 9 figures. Upcoming publication in the AAAI/ACM Conference on AI, Ethics, and Society 2023.

In this multicultural age, language translation is one of the most performed tasks, and it is becoming increasingly AI-moderated and automated. As a novel AI system, ChatGPT claims to be proficient in such translation tasks, and in this paper we put that claim to the test. Specifically, we examine ChatGPT's accuracy in translating between English and languages that exclusively use gender-neutral pronouns. We center this study around Bengali, the 7th most spoken language globally, but also generalize our findings across five other languages: Farsi, Malay, Tagalog, Thai, and Turkish. We find that ChatGPT perpetuates gender defaults and stereotypes assigned to certain occupations (e.g., man = doctor, woman = nurse) or actions (e.g., woman = cook, man = go to work), as it converts gender-neutral pronouns in these languages to 'he' or 'she'. We also observe ChatGPT completely failing to translate the English gender-neutral pronoun 'they' into equivalent gender-neutral pronouns in other languages, producing translations that are incoherent and incorrect. While it does respect and provide appropriately gender-marked versions of Bengali words when prompted with gender information in English, ChatGPT appears to confer higher respect on men than on women in the same occupation. We conclude that ChatGPT exhibits the same gender biases that have been demonstrated for tools like Google Translate or MS Translator, and we provide recommendations for a human-centered approach for future designers of AIs that perform language translation to better accommodate such low-resource languages.
