We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation. To support this claim, we introduce Representational Transfer Potential (RTP), which measures representational similarities between languages. We show that RTP can measure both positive and negative transfer (interference), and find that RTP is strongly correlated with changes in translation quality, indicating that transfer does occur. Furthermore, we investigate data and language characteristics that are relevant for transfer, and find that multi-parallel overlap is an important yet under-explored feature. Based on this, we develop a novel training scheme, which uses an auxiliary similarity loss that encourages representations to be more invariant across languages by taking advantage of multi-parallel data. We show that our method yields increased translation quality for low- and mid-resource languages across multiple data and model setups.
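To make the idea of an auxiliary similarity loss concrete, the following is a minimal sketch and not the method as specified in the paper. The abstract does not state how representations are pooled, which distance is used, or how the auxiliary term is weighted, so mean pooling, cosine distance, the `aux_weight` value, and all function names here are assumptions chosen for illustration. The sketch takes encoder states for the same sentence in two source languages (a multi-parallel pair) and adds a penalty that shrinks as the two sentence representations become more similar.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_pool(token_states: Sequence[Vector]) -> List[float]:
    """Average token-level encoder states into one sentence vector (assumed pooling)."""
    n = len(token_states)
    dim = len(token_states[0])
    return [sum(tok[d] for tok in token_states) / n for d in range(dim)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def combined_loss(
    translation_loss: float,
    states_lang_a: Sequence[Vector],
    states_lang_b: Sequence[Vector],
    aux_weight: float = 0.1,  # hypothetical weight, not taken from the paper
) -> float:
    """Translation loss plus a weighted similarity penalty on a multi-parallel pair."""
    aux = cosine_distance(mean_pool(states_lang_a), mean_pool(states_lang_b))
    return translation_loss + aux_weight * aux
```

Under these assumptions, two source sentences whose pooled representations coincide contribute no penalty, while dissimilar representations raise the total loss, which is the sense in which such a term would push representations toward being more invariant across languages.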