Large-scale pretrained language models (LLMs), such as ChatGPT and GPT-4, have shown strong multilingual translation abilities without being explicitly trained on parallel corpora. How LLMs acquire the ability to follow translation instructions across languages remains an interesting question. In this paper, we present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation following given instructions. First, we show that multilingual LLMs have stronger translation abilities than previously demonstrated. For a given language, translation performance depends on its similarity to English and on the amount of data used in the pretraining phase. Second, we find that an LLM's ability to carry out translation instructions relies on its understanding of those instructions and on the alignment among different languages. With multilingual finetuning, LLMs can learn to perform translation well even for language pairs unseen during the instruction tuning phase.
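To make the setup concrete, the following is a minimal sketch of how parallel sentence pairs might be turned into instruction-formatted finetuning examples, and how language pairs might be split into those seen during instruction tuning and those held out for evaluation. The template wording, the function names `build_example` and `split_pairs`, and the language set are illustrative assumptions, not the format used in the paper.

```python
from itertools import permutations

# Hypothetical instruction template (an assumption, not the paper's exact wording).
TEMPLATE = "Translate the following text from {src} to {tgt}.\n{src}: {text}\n{tgt}:"


def build_example(src_lang, tgt_lang, src_text, tgt_text):
    """Turn one parallel sentence pair into a prompt/completion record."""
    prompt = TEMPLATE.format(src=src_lang, tgt=tgt_lang, text=src_text)
    # The leading space separates the completion from the trailing colon.
    return {"prompt": prompt, "completion": " " + tgt_text}


def split_pairs(languages, held_out):
    """Split all ordered language pairs into seen (training) and unseen (evaluation only)."""
    all_pairs = list(permutations(languages, 2))
    seen = [pair for pair in all_pairs if pair not in held_out]
    unseen = [pair for pair in all_pairs if pair in held_out]
    return seen, unseen


example = build_example("German", "English", "Guten Morgen.", "Good morning.")
seen, unseen = split_pairs(
    ["English", "German", "Chinese"],
    held_out={("German", "Chinese"), ("Chinese", "German")},
)
```

Here `seen` would supply the directions used for finetuning, while `unseen` contains directions that appear only at test time, mirroring the evaluation on language pairs not covered during instruction tuning.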