Cross-lingual transfer (XLT) is an emergent ability of multilingual language models: a model fine-tuned on a task retains much of its performance when evaluated in languages that were not included in fine-tuning. Although English, owing to its widespread use, is typically treated as the default source language for model adaptation, recent studies have shown that XLT becomes more effective when the most suitable source language is selected for the conditions at hand. In this work, we propose using the sub-network similarity between two languages as a proxy for predicting their compatibility in XLT. Our approach is model-oriented and thus better reflects the inner workings of foundation models. In addition, it requires only a moderate amount of raw text from the candidate languages, unlike most previous methods, which rely on external resources. In experiments, we demonstrate that our method is more effective than baselines across diverse tasks. Specifically, it is proficient at ranking candidate source languages for zero-shot XLT, improving NDCG@3 by 4.6% on average. We also provide extensive analyses that confirm the utility of sub-networks for XLT prediction.
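For readers unfamiliar with the ranking metric reported above, the following is a minimal sketch of how NDCG@3 could be computed for a ranking of candidate source languages. It assumes the common linear-gain form of DCG and uses the observed transfer performance as the relevance value; the abstract does not state which variant the authors use. The function names, language codes, and all numbers are hypothetical and serve only as illustration.

```python
import math
from typing import Dict, Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k relevance values (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(predicted: Dict[str, float], actual: Dict[str, float], k: int = 3) -> float:
    """NDCG@k of the ranking induced by `predicted`, judged against `actual`."""
    # Rank candidate source languages by predicted score, best first.
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    # Relevance of each ranked candidate is its observed transfer performance.
    gains = [actual[lang] for lang in ranked]
    # The ideal ranking orders candidates by their observed performance.
    ideal_dcg = dcg_at_k(sorted(actual.values(), reverse=True), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical similarity scores between each candidate source language
# and one target language (assumed values, not taken from the paper).
similarity = {"de": 0.82, "es": 0.75, "ja": 0.31, "tr": 0.44}

# Hypothetical zero-shot accuracy on the target language after fine-tuning
# on each candidate source language (assumed values, not taken from the paper).
transfer_accuracy = {"de": 0.71, "es": 0.74, "ja": 0.52, "tr": 0.60}

score = ndcg_at_k(similarity, transfer_accuracy, k=3)
print(f"NDCG@3 = {score:.4f}")
```

A score of 1.0 means the top three predicted candidates appear in the same order as the three best-performing source languages; swapping the top two, as in this toy example, lowers the score slightly.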