How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation

Generative Pre-trained Transformer (GPT) models have shown remarkable capabilities for natural language generation, but their performance for machine translation has not been thoroughly investigated. In this paper, we present a comprehensive evaluation of GPT models for machine translation, covering various aspects such as the quality of different GPT models in comparison with state-of-the-art research and commercial systems, the effect of prompting strategies, robustness to domain shifts, and document-level translation. We experiment with eighteen different translation directions involving high- and low-resource languages, as well as non-English-centric translations, and evaluate the performance of three GPT models: ChatGPT, GPT3.5 (text-davinci-003), and text-davinci-002. Our results show that GPT models achieve very competitive translation quality for high-resource languages, while having limited capabilities for low-resource languages. We also show that hybrid approaches, which combine GPT models with other translation systems, can further enhance the translation quality. We perform comprehensive analysis and human evaluation to further understand the characteristics of GPT translations. We hope that our paper provides valuable insights for researchers and practitioners in the field and helps to better understand the potential and limitations of GPT models for translation.

