The capacity and effectiveness of pre-trained multilingual models (MLMs) for zero-shot cross-lingual transfer are well established. However, positive and negative transfer, and the effect of language choice, are not yet fully understood, especially in the complex setting of massively multilingual LMs. We propose an \textit{efficient} method to study the influence of a transfer language on zero-shot performance in another target language. Unlike previous work, our approach disentangles downstream tasks from languages using dedicated adapter units. Our findings suggest that some languages have little effect on others, while some languages, especially ones unseen during pre-training, can be extremely beneficial or detrimental for different target languages. We find that no transfer language is beneficial for all target languages. Curiously, we do observe that languages previously unseen by MLMs consistently benefit from transfer from almost any language. We additionally use our modular approach to quantify negative interference efficiently and categorize languages accordingly. Furthermore, we provide a list of promising transfer-target language configurations that consistently lead to target language performance improvements. Code and data are publicly available: https://github.com/ffaisal93/neg_inf