As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial. One promising approach is cross-lingual transfer, where a model acquires a specific capability in one language by finetuning on another language. In this work, we investigate how multilinguality during instruction tuning of a multilingual LLM affects instruction-following across languages. We first show that, even under monolingual tuning, many languages transfer some instruction-following capabilities to other languages. Furthermore, we find that only 40 multilingual examples in an English tuning set substantially improve multilingual instruction-following, both in languages seen during tuning and in unseen ones. In general, we observe that models tuned on multilingual mixtures exhibit comparable or superior performance in several languages compared to monolingually tuned models, despite training on 10x fewer examples in those languages. Finally, we find that increasing the number of languages in the instruction tuning set from 1 to only 2, 3, or 4 increases cross-lingual generalization. Our results suggest that building massively multilingual instruction-tuned models can be done with only a very small set of multilingual instruction-response pairs.