Recent studies have demonstrated the remarkable capabilities of pre-trained multilingual Transformers, especially their cross-lingual transferability. However, current methods do not measure cross-lingual transferability well, which hinders the understanding of multilingual Transformers. In this paper, we propose IGap, a cross-lingual transferability metric for multilingual Transformers on sentence classification tasks. IGap takes training error into account and can also estimate transferability without end-task data. Experimental results show that IGap outperforms baseline metrics at both measuring transferability and ranking transfer directions. In addition, we conduct extensive, systematic experiments comparing transferability across various multilingual Transformers, fine-tuning algorithms, and transfer directions. More importantly, our results reveal three findings about cross-lingual transfer, which help us better understand multilingual Transformers.