Current research in zero-shot translation is plagued by several issues, such as high compute requirements, increased training time, and off-target translations. Proposed remedies often come at the cost of additional data or compute. Pivot-based neural machine translation is preferred over a single-encoder model in most settings, despite its increased training and evaluation time. In this work, we overcome the shortcomings of zero-shot translation by taking advantage of transliteration and linguistic similarity. We build a single encoder-decoder neural machine translation system for Dravidian-Dravidian multilingual translation and perform zero-shot translation. We examine the trade-off between training data and zero-shot accuracy, and evaluate the performance of our vanilla method against the current state-of-the-art pivot-based method. We also test the theory that morphologically rich languages require large vocabularies by restricting the vocabulary using an optimal-transport-based technique. Our model achieves scores within 3 BLEU points of large-scale pivot-based models when it is trained on 50% of the language directions.