Gender bias in machine translation (MT) systems poses a significant challenge to achieving accurate and inclusive translations. This paper examines gender bias in MT systems for Telugu and Kannada, two languages from the Dravidian family, analyzing how gender inflections affect translation accuracy and neutrality in Google Translate and ChatGPT. It finds that while plural forms can reduce bias, individual-centric sentences often retain bias rooted in historical stereotypes. The study evaluates Chain-of-Thought prompting, observing substantial bias mitigation: from 80% to 4% in Telugu and from 40% to 0% in Kannada. It also compares Telugu and Kannada translations, emphasizing the need for language-specific strategies to address these challenges and suggesting directions for future research to enhance fairness both in data preparation and in prompt design during inference.