Despite the impressive growth in the abilities of multilingual language models such as XLM-R and mT5, they have been shown to still struggle with typologically distant languages, particularly in the low-resource setting. One obstacle to effective cross-lingual transfer is variability in word-order patterns. This variability can potentially be mitigated via source- or target-side word reordering, and numerous approaches to reordering have been proposed. However, they rely on language-specific rules, operate at the level of POS tags, or target only the main clause, leaving subordinate clauses intact. To address these limitations, we present a new, powerful reordering method, defined in terms of Universal Dependencies, that can learn fine-grained word-order patterns conditioned on the syntactic context from a small amount of annotated data and can be applied at all levels of the syntactic tree. We conduct experiments on a diverse set of tasks and show that our method consistently outperforms strong baselines across different language pairs and model architectures. This performance advantage holds in both zero-shot and few-shot scenarios.