Zero-shot cross-lingual transfer is a central task in multilingual NLP: it allows models trained on high-resource languages to generalize to low-resource languages. Earlier work on this task relies on parallel corpora, bilingual dictionaries, or other annotated alignment data to improve cross-lingual transferability, all of which are typically expensive to obtain. In this paper, we propose SALT, a simple yet effective method that improves the zero-shot cross-lingual transfer of multilingual pretrained language models (PLMs) without such external data. By incorporating code-switching and embedding mixup with self-augmentation, SALT distills cross-lingual knowledge from the multilingual PLM and enhances its transferability on downstream tasks. Experimental results on XNLI and PAWS-X show that our method improves zero-shot cross-lingual transferability without external data. Our code is available at https://github.com/luka-group/SALT.
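The abstract names code-switching and embedding mixup but does not spell out either procedure, so the following is only a minimal sketch of the two generic operations, not the SALT implementation. The function names `code_switch` and `mixup_embeddings`, the `ratio` and `lam` parameters, and the toy `substitutes` table are all hypothetical; in a self-augmentation setting the substitutes would presumably come from the multilingual PLM's own predictions rather than from a hand-written dictionary.

```python
import random


def code_switch(tokens, substitutes, ratio, rng):
    """Replace a random share of tokens with counterparts from another language.

    `substitutes` is a hypothetical token -> replacement table; it stands in
    for whatever the model itself would propose (an assumption, not SALT's API).
    """
    switched = list(tokens)
    # Only tokens that have a known replacement can be switched.
    candidates = [i for i, tok in enumerate(tokens) if tok in substitutes]
    k = round(ratio * len(candidates))
    for i in rng.sample(candidates, k):
        switched[i] = substitutes[tokens[i]]
    return switched


def mixup_embeddings(original, substitute, lam):
    """Linearly interpolate two embedding vectors: lam * original + (1 - lam) * substitute."""
    if len(original) != len(substitute):
        raise ValueError("embedding vectors must have the same dimension")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * o + (1.0 - lam) * s for o, s in zip(original, substitute)]


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = ["the", "cat", "sleeps"]
    table = {"cat": "gato", "sleeps": "duerme"}  # toy, hand-written stand-in
    print(code_switch(sentence, table, ratio=0.5, rng=rng))
    print(mixup_embeddings([1.0, 0.0], [0.0, 1.0], lam=0.25))
```

Code-switching acts on discrete tokens, while mixup acts on continuous vectors, so the second operation can blend an original token with its substitute instead of replacing it outright.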