Can Machine Translation Bridge Multilingual Pretraining and Cross-lingual Transfer Learning?

Multilingual pretraining and fine-tuning have achieved remarkable success in a variety of natural language processing tasks. Transferring representations from one language to another is especially crucial for cross-lingual learning. One might expect machine translation objectives to be well suited to fostering such capabilities, since they involve the explicit alignment of semantically equivalent sentences from different languages. This paper investigates the potential benefits of using machine translation as a continued training objective to enhance language representation learning, bridging multilingual pretraining and cross-lingual applications. We study this question through two lenses: a quantitative evaluation of the performance of existing models and an analysis of their latent representations. Our results show that, contrary to expectations, machine translation as a continued training objective fails to enhance cross-lingual representation learning in multiple cross-lingual natural language understanding tasks. We conclude that explicit sentence-level alignment in the cross-lingual scenario is detrimental to cross-lingual transfer pretraining, which has important implications for future cross-lingual transfer studies. We furthermore provide evidence, through similarity measures and an investigation of parameters, that this lack of positive influence is due to output separability, which we argue is useful for machine translation but detrimental elsewhere.

