We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation. To support this claim, we introduce Representational Transfer Potential (RTP), which measures representational similarities between languages. We show that RTP can measure both positive and negative transfer (interference), and find that RTP is strongly correlated with changes in translation quality, indicating that transfer does occur. Furthermore, we investigate data and language characteristics that are relevant for transfer, and find that multi-parallel overlap is an important yet under-explored feature. Based on this, we develop a novel training scheme, which uses an auxiliary similarity loss that encourages representations to be more invariant across languages by taking advantage of multi-parallel data. We show that our method yields increased translation quality for low- and mid-resource languages across multiple data and model setups.
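To make the idea of an auxiliary similarity loss concrete, the following is a minimal sketch and not the training scheme of the paper itself. It assumes that each multi-parallel example provides encoder outputs for the same sentence in two source languages, that sentence representations are obtained by mean pooling, and that similarity is measured with cosine similarity. The names `mean_pool`, `similarity_loss`, `total_loss`, and the `weight` parameter are hypothetical and chosen for illustration only.

```python
import math


def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one sentence vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity of two vectors; eps guards against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def similarity_loss(enc_a, enc_b):
    """Assumed auxiliary loss: 1 - cosine similarity of the pooled encodings.

    enc_a and enc_b are the encoder outputs (lists of token vectors) for the
    same sentence written in two different source languages.
    """
    return 1.0 - cosine_similarity(mean_pool(enc_a), mean_pool(enc_b))


def total_loss(translation_loss, enc_a, enc_b, weight=0.1):
    """Translation loss plus the weighted auxiliary term (weight is an assumption)."""
    return translation_loss + weight * similarity_loss(enc_a, enc_b)


# Toy example: two token vectors per sentence, dimension 2.
enc_lang_a = [[1.0, 0.0], [1.0, 0.0]]
enc_lang_b = [[0.0, 1.0], [0.0, 1.0]]
loss = total_loss(2.0, enc_lang_a, enc_lang_b, weight=0.5)
```

In this sketch the auxiliary term is zero when the pooled representations of the two languages point in the same direction and grows as they diverge, so minimizing it alongside the translation loss pushes representations toward language invariance. The pooling strategy, the distance function, and the weighting are all design choices that a real implementation would need to settle.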