Parameter-efficient finetuning has emerged as a viable way to improve the performance of Large Language Models (LLMs) without requiring massive compute resources. Prior work on multilingual evaluation has shown a large gap between the performance of LLMs on English and on other languages, as well as a large gap between smaller open-source models and larger LLMs. Finetuning can be an effective way to bridge these gaps and make language models more equitable. In this work, we finetune the LLaMA-7B and Mistral-7B models on synthetic multilingual instruction tuning data to determine its effect on model performance on five downstream tasks covering twenty-three languages in all. Additionally, we experiment with various parameters, such as the rank for low-rank adaptation and the quantisation values, to determine their effects on downstream performance, and find that higher rank and higher quantisation values benefit low-resource languages. We find that parameter-efficient finetuning of smaller open-source models sometimes bridges the gap between the performance of these models and the larger ones; however, English performance can take a hit. We also find that finetuning sometimes improves performance on low-resource languages while degrading performance on high-resource languages.
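To give a rough sense of what the rank and quantisation knobs trade off, the sketch below counts the trainable parameters that low-rank adaptation adds to a single weight matrix and estimates the storage needed for the frozen weights at different bit widths. The 4096 x 4096 layer shape, the ranks, and the bit widths are illustrative assumptions and are not taken from this work; the function names are hypothetical.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def weight_bytes(num_params: int, bits: int) -> int:
    """Approximate storage for num_params weights at the given bit width.

    Ignores quantisation metadata such as scales and zero points.
    """
    return num_params * bits // 8


# Assumed layer shape, chosen only for illustration.
D_OUT, D_IN = 4096, 4096
full = D_OUT * D_IN

for rank in (8, 16, 64):  # hypothetical ranks
    extra = lora_trainable_params(D_OUT, D_IN, rank)
    print(f"rank={rank:>2}: {extra:>7} trainable params "
          f"({100 * extra / full:.2f}% of the full matrix)")

for bits in (4, 8, 16):  # hypothetical bit widths for the frozen weights
    mib = weight_bytes(full, bits) / 2**20
    print(f"{bits:>2}-bit frozen weights: {mib:.0f} MiB")
```

The adapter size grows linearly with the rank while the frozen-weight footprint grows linearly with the bit width, which is why both settings can be varied cheaply relative to full finetuning.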