As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial. One promising approach is cross-lingual transfer, where a model acquires a specific capability in one language by finetuning on another language. In this work, we investigate how multilinguality during instruction tuning of a multilingual LLM affects instruction-following across languages. We first show that many languages transfer some instruction-following capabilities to other languages even from monolingual tuning. Furthermore, we find that only 40 multilingual examples in an English tuning set substantially improve multilingual instruction-following, both in languages seen during tuning and in languages unseen during tuning. In general, we observe that models tuned on multilingual mixtures exhibit comparable or superior performance in several languages compared to monolingually tuned models, despite training on 10x fewer examples in those languages. Finally, we find that increasing the number of languages in the instruction tuning set from 1 to only 2, 3, or 4 increases cross-lingual generalization. Our results suggest that massively multilingual instruction-tuned models can be built with only a very small set of multilingual instruction-response pairs.
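As a rough illustration of the kind of tuning mixture described above, the following is a minimal sketch of building a mostly English instruction-tuning set that contains a small number of multilingual examples. It is not the authors' implementation: the function name `build_mixture`, the example schema (`lang`, `instruction`, `response`), the choice to replace English examples rather than append to them, and the even split across languages are all assumptions made for illustration.

```python
import random
from typing import Dict, List

# Hypothetical example schema: {"lang": ..., "instruction": ..., "response": ...}
Example = Dict[str, str]


def build_mixture(
    english: List[Example],
    multilingual_by_lang: Dict[str, List[Example]],
    n_multilingual: int = 40,
    seed: int = 0,
) -> List[Example]:
    """Swap n_multilingual English examples for non-English ones.

    Assumption: the multilingual budget is split as evenly as possible
    across the available languages and the total set size stays fixed.
    """
    if not multilingual_by_lang:
        raise ValueError("need at least one non-English language")
    if n_multilingual > len(english):
        raise ValueError("multilingual budget exceeds the English set size")

    rng = random.Random(seed)
    langs = sorted(multilingual_by_lang)
    per_lang, remainder = divmod(n_multilingual, len(langs))

    picked: List[Example] = []
    for i, lang in enumerate(langs):
        # The first `remainder` languages receive one extra example.
        k = per_lang + (1 if i < remainder else 0)
        picked.extend(rng.sample(multilingual_by_lang[lang], k))

    # Keep the overall size constant by dropping as many English examples
    # as multilingual ones were added.
    kept_english = rng.sample(english, len(english) - n_multilingual)

    mixture = kept_english + picked
    rng.shuffle(mixture)
    return mixture
```

Calling `build_mixture(english, pool, n_multilingual=40)` with, say, four non-English pools would yield a set of the same size as `english` in which 10 examples come from each of the four languages. Whether the multilingual examples should replace or supplement English ones is a design choice this sketch fixes arbitrarily.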