The training paradigm for machine translation has gradually shifted from training neural machine translation (NMT) models on extensive parallel corpora to instruction finetuning of multilingual large language models (LLMs) on high-quality translation pairs. In this paper, we focus on boosting the many-to-many multilingual translation performance of LLMs, with an emphasis on zero-shot translation directions. We demonstrate that the prompt strategies adopted during finetuning are crucial to zero-shot translation, and we introduce a cross-lingual consistency regularization, XConST, to bridge the representation gap among different languages and improve zero-shot translation performance. XConST is not a new method but a version of CrossConST (Gao et al., 2023a) adapted for translation instruction finetuning with LLMs. Experimental results on ALMA (Xu et al., 2023), Tower (Team, 2024), and LLaMA-2 (Touvron et al., 2023) show that our approach consistently improves translation performance. Our implementations are available at https://github.com/gpengzhi/CrossConST-LLM.
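To make the idea of a cross-lingual consistency regularization concrete, the following is a minimal sketch and not the released implementation. It assumes a CrossConST-style objective: the usual cross-entropy on the translation prompt, plus a weighted KL divergence between the token distributions predicted when the prompt contains the source sentence and when it contains the target sentence in its place. The function names, the KL direction, and the weight `alpha` are illustrative assumptions; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_regularized_loss(logits_src, logits_tgt, gold_ids, alpha=1.0):
    """Hypothetical per-token loss: cross-entropy plus alpha * KL.

    logits_src: per-position logits when the prompt holds the source sentence.
    logits_tgt: per-position logits when the prompt holds the target sentence
                instead (assumed setup, following the CrossConST idea).
    gold_ids:   index of the reference token at each position.
    """
    ce, kl = 0.0, 0.0
    for ls, lt, gold in zip(logits_src, logits_tgt, gold_ids):
        p_src = softmax(ls)
        p_tgt = softmax(lt)
        ce += -math.log(p_src[gold])       # standard translation loss
        kl += kl_divergence(p_src, p_tgt)  # consistency term (direction assumed)
    return (ce + alpha * kl) / len(gold_ids)


# Toy example with a three-token vocabulary and two output positions.
src_logits = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
tgt_logits = [[1.0, 1.0, 0.0], [0.0, 2.0, 0.0]]
gold = [0, 1]
loss = consistency_regularized_loss(src_logits, tgt_logits, gold, alpha=1.0)
```

When the two sets of logits agree, the KL term vanishes and the loss reduces to plain cross-entropy; the regularizer only penalizes disagreement between the two views of the same sentence pair.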