Instruction-tuned large language models (LLMs) have demonstrated remarkable capabilities across a variety of tasks and languages. However, their ability to generalize to underrepresented languages is limited by the scarcity of available data. Moreover, directly adapting an instruction-tuned LLM to new languages can cause catastrophic forgetting, which leads to the loss of its multitask ability. To address this issue, we propose InstructAlign, which uses continual crosslingual instruction tuning to enable LLMs to align new, unseen languages with previously learned high-resource languages. Our results demonstrate the effectiveness of InstructAlign in enabling the model to understand low-resource languages with limited parallel data while preventing catastrophic forgetting. Our work contributes to the advancement of language adaptation methods, particularly for adapting instruction-tuned LLMs to underrepresented languages. Our code is released at https://github.com/HLTCHKUST/InstructAlign