Towards Making the Most of ChatGPT for Machine Translation

ChatGPT shows remarkable capabilities for machine translation (MT). Several prior studies have shown that it achieves comparable results to commercial systems for high-resource languages, but lags behind in complex tasks, e.g., low-resource and distant-language-pairs translation. However, they usually adopt simple prompts which can not fully elicit the capability of ChatGPT. In this paper, we aim to further mine ChatGPT's translation ability by revisiting several aspects: temperature, task information, and domain information, and correspondingly propose an optimal temperature setting and two (simple but effective) prompts: Task-Specific Prompts (TSP) and Domain-Specific Prompts (DSP). We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually can achieve better performance; 2) Emphasizing the task information can further improve ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still need to be highlighted for the MT/NLP community. We also explore the effects of advanced in-context learning strategies and find a (negative but interesting) observation: the powerful chain-of-thought prompt leads to word-by-word translation behavior, thus bringing significant translation degradation.

翻译：ChatGPT在机器翻译（MT）方面展现出显著能力。先前多项研究表明，ChatGPT在高资源语言翻译中能达到与商业系统相当的水平，但在复杂任务（如低资源语言及远距离语言对翻译）中仍存在差距。然而，这些研究通常采用简单提示（prompt），未能充分激发ChatGPT的潜力。本文旨在通过重新审视温度参数（temperature）、任务信息和领域信息三个维度，进一步挖掘ChatGPT的翻译能力，并相应提出最优温度设置及两种（简单而有效的）提示策略：任务特定提示（TSP）和领域特定提示（DSP）。我们发现：1）ChatGPT的性能严重依赖温度设置，较低温度通常能取得更优效果；2）强调任务信息可进一步提升ChatGPT性能，尤其在复杂机器翻译任务中；3）引入领域信息能激发ChatGPT的泛化能力，提升其在特定领域的表现力；4）ChatGPT在非英语中心的机器翻译任务中易产生幻觉（hallucination），我们的提示策略可部分缓解该问题，但仍需引起机器翻译与自然语言处理领域的关注。此外，我们探究了先进上下文学习策略的效果，并发现一个（负面但有趣的）现象：强大的思维链提示（chain-of-thought prompt）会引发逐词翻译行为，从而显著降低翻译质量。

相关内容

ChatGPT

关注 258

ChatGPT（全名：Chat Generative Pre-trained Transformer），美国OpenAI 研发的聊天机器人程序 [1] ，于2022年11月30日发布。ChatGPT是人工智能技术驱动的自然语言处理工具，它能够通过学习和理解人类的语言来进行对话，还能根据聊天的上下文进行互动，真正像人类一样来聊天交流，甚至能完成撰写邮件、视频脚本、文案、翻译、代码，写论文任务。 [1] https://openai.com/blog/chatgpt/

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日