Large-scale Pretrained Language Models (LLMs), such as ChatGPT and GPT4, have shown strong abilities in multilingual translation without being explicitly trained on parallel corpora. How LLMs acquire the ability to carry out translation instructions for different languages is an interesting question. In this paper, we present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation following given instructions. First, we show that multilingual LLMs have stronger translation abilities than previously demonstrated. For a given language, translation performance depends on its similarity to English and on the amount of data used in the pretraining phase. Second, we find that LLMs' ability to carry out translation instructions relies on their understanding of the instructions and on the alignment among different languages. With multilingual finetuning, LLMs can learn to perform the translation task well even for language pairs unseen during the instruction tuning phase.