Despite significant improvements in translation quality, context-aware machine translation (MT) models still underperform in many cases. One of the main reasons is that they fail to use the correct features from the context when the context is too long or the model is overly complex. This can lead to the explain-away effect, in which a model relies only on the features that most easily explain its predictions, resulting in inaccurate translations. To address this issue, we propose a model that explains its translation decisions by predicting coreference features in the input. We build the input coreference model on top of an existing MT model by exploiting contextual features from both the input and the translation output representations. We evaluate and analyze our method on the WMT document-level English-German translation task, the English-Russian dataset, and the multilingual TED talk dataset, demonstrating an improvement of more than 1.0 BLEU over other context-aware models.
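As a rough illustration of predicting input coreference from both input and translation output representations, here is a minimal Python sketch. It is not the architecture proposed in this work, whose details the abstract does not give. The function names (`attend`, `antecedent_probs`, `joint_loss`), the dot-product scoring, the attention-based summary of decoder states, and the `weight` hyperparameter are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Attention-weighted average of `keys`, using dot-product weights."""
    weights = softmax([dot(query, k) for k in keys])
    dim = len(keys[0])
    return [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]


def antecedent_probs(enc_states, dec_states, mention, candidates):
    """Distribution over candidate antecedent positions for one source mention.

    Assumption: each source token is represented by its encoder state
    concatenated with an attention summary of the decoder (output) states.
    """
    feats = [h + attend(h, dec_states) for h in enc_states]  # list concatenation
    query = feats[mention]
    return softmax([dot(query, feats[c]) for c in candidates])


def joint_loss(translation_nll, probs, gold_index, weight=1.0):
    """Translation loss plus a weighted coreference loss (hypothetical objective)."""
    coref_nll = -math.log(probs[gold_index])
    return translation_nll + weight * coref_nll


# Toy usage: three source tokens, two target tokens, all 2-dimensional.
enc_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
dec_states = [[0.9, 0.0], [0.0, 0.8]]
probs = antecedent_probs(enc_states, dec_states, mention=2, candidates=[0, 1])
loss = joint_loss(translation_nll=2.0, probs=probs, gold_index=0, weight=0.5)
```

In this toy setup the coreference term acts as an auxiliary objective: the translation loss is left unchanged, and the extra term penalizes representations from which the correct antecedent cannot be recovered.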