Accurate translation of bug reports is critical for efficient collaboration in global software development. In this study, we conduct the first comprehensive evaluation of machine translation (MT) performance on bug reports, analyzing the capabilities of DeepL, AWS Translate, and ChatGPT using data from the Visual Studio Code GitHub repository, specifically reports labeled with the "english-please" tag. To thoroughly assess the accuracy and effectiveness of each system, we employ multiple machine translation metrics, including BLEU, BERTScore, COMET, METEOR, and ROUGE. Our findings indicate that DeepL consistently outperforms the other systems across most automatic metrics, demonstrating strong lexical and semantic alignment. AWS Translate performs competitively, particularly on METEOR, while ChatGPT lags behind on key metrics. This study underscores the importance of domain adaptation for translating technical texts and offers guidance for integrating automated translation into bug-triaging workflows. Moreover, our results establish a foundation for future research on refining machine translation solutions for specialized engineering contexts. The code and dataset for this paper are available on GitHub: https://github.com/av9ash/gitbugs/tree/main/multilingual.
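To illustrate the kind of lexical-overlap scoring that metrics such as BLEU perform, the sketch below implements a simplified sentence-level BLEU (modified n-gram precision with a brevity penalty) using only the Python standard library. This is not the evaluation code used in the study, and real evaluations should use an established implementation (e.g., sacreBLEU); the smoothing constant here is a simplifying assumption for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty.
    Uses an ad-hoc 0.5 smoothing count for zero-overlap orders."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngrams(hyp, n)
        ref_ngrams = ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append((overlap or 0.5) / total)  # crude smoothing
    # Brevity penalty: punish hypotheses shorter than the reference.
    if len(hyp) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 1.0, while partial overlap yields a score between 0 and 1, mirroring how the automatic metrics in this study rank candidate translations against reference translations.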