In recent years, pre-trained Multilingual Language Models (MLLMs) have shown a strong ability to transfer knowledge across languages. However, because this ability was not an explicit design goal of most MLLMs, it is difficult to arrive at a single, straightforward explanation for its emergence. In this review paper, we survey the literature investigating the factors that contribute to the capacity of MLLMs to perform zero-shot cross-lingual transfer, and we outline and discuss these factors in detail. To structure the review and to facilitate consolidation with future studies, we group these factors into five categories. In addition to summarizing the empirical evidence from past studies, we identify points of consensus among studies with consistent findings and resolve conflicts among contradictory ones. Our work contextualizes and unifies existing research streams that aim to explain the cross-lingual potential of MLLMs. This review provides, first, an aligned reference point for future research and, second, guidance for a better-informed and more efficient way of leveraging the cross-lingual capacity of MLLMs.