This report provides a preliminary evaluation of ChatGPT for machine translation, covering translation prompts, multilingual translation, and translation robustness. We adopt the prompts suggested by ChatGPT itself to trigger its translation ability and find that the candidate prompts generally work well, with only minor performance differences among them. Evaluating on a number of benchmark test sets, we find that ChatGPT performs competitively with commercial translation products (e.g., Google Translate) on high-resource European languages but lags significantly behind on low-resource or distant languages. For distant languages, we explore a strategy named $\mathbf{pivot~prompting}$, which asks ChatGPT to translate the source sentence first into a high-resource pivot language and then into the target language, and which improves translation performance significantly. As for translation robustness, ChatGPT does not perform as well as the commercial systems on biomedical abstracts or Reddit comments, but it is potentially a good translator for spoken language. Scripts and data: https://github.com/wxjiao/Is-ChatGPT-A-Good-Translator
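The pivot prompting idea described in the abstract can be illustrated with a minimal sketch that only builds the prompt text. The function names and the exact prompt wording below are assumptions made for illustration; they are not taken from the report or its repository, and no model API is called.

```python
def build_direct_prompt(text: str, tgt_lang: str) -> str:
    """Build a prompt asking for a direct translation into tgt_lang.

    The wording is hypothetical, not the report's exact prompt.
    """
    return f"Please provide the {tgt_lang} translation for this sentence:\n{text}"


def build_pivot_prompt(text: str, pivot_lang: str, tgt_lang: str) -> str:
    """Build a prompt asking for the pivot-language translation first,
    then the target-language translation.

    The wording is hypothetical, not the report's exact prompt.
    """
    if pivot_lang == tgt_lang:
        # A pivot identical to the target would reduce to direct translation.
        raise ValueError("pivot and target languages must differ")
    return (
        f"Please provide the {pivot_lang} translation first and then "
        f"the {tgt_lang} translation for this sentence:\n{text}"
    )


if __name__ == "__main__":
    # Example: a German source sentence routed through English to Chinese.
    print(build_pivot_prompt("Guten Morgen.", pivot_lang="English", tgt_lang="Chinese"))
```

The resulting string would be sent to the model in place of a direct translation prompt; the choice of English as the pivot here is only an example of a high-resource language.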