Retrieval-Augmented Machine Translation (RAMT) is attracting growing attention, as it not only improves translation metrics but is also assumed to implement some form of domain adaptation. In this contribution, we study another salient trait of RAMT: its ability to make translation decisions more transparent by allowing users to trace them back to the examples that contributed to them. To this end, we propose a novel architecture aiming to increase this transparency. The model adapts a retrieval-augmented version of the Levenshtein Transformer so that it can simultaneously edit multiple fuzzy matches found in memory. We discuss how to perform training and inference in this model, based on multi-way alignment algorithms and imitation learning. Our experiments show that editing several examples positively impacts translation scores, notably increasing the number of target spans that are copied from existing instances.
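To make the notion of "fuzzy matches found in memory" concrete, the following is a minimal sketch of how several candidate examples could be retrieved from a translation memory. It assumes a token-level Levenshtein distance normalized by the longer sentence length as the similarity score, which is a common definition of fuzzy matching but not necessarily the retrieval method used in this work. The function names (`edit_distance`, `fuzzy_score`, `retrieve`), the threshold, and the toy memory are all hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Token-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,         # delete tok_a
                            curr[j - 1] + 1,     # insert tok_b
                            prev[j - 1] + cost)) # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_score(a: List[str], b: List[str]) -> float:
    """Assumed similarity: 1 - distance / max length, in [0, 1]."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def retrieve(source: str,
             memory: List[Tuple[str, str]],
             k: int = 2,
             threshold: float = 0.4) -> List[Tuple[float, str, str]]:
    """Return up to k (score, source, target) entries scoring above the threshold."""
    query = source.split()
    scored = [(fuzzy_score(query, src.split()), src, tgt) for src, tgt in memory]
    scored = [entry for entry in scored if entry[0] >= threshold]
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return scored[:k]


# Toy translation memory (hypothetical data, for illustration only).
memory = [
    ("the cat sat on the mat", "le chat était assis sur le tapis"),
    ("the dog sat on the mat", "le chien était assis sur le tapis"),
    ("stock prices fell sharply", "les cours ont fortement chuté"),
]
matches = retrieve("the cat sat on the rug", memory)
```

In a multi-example setting such as the one described above, the target sides of all retrieved matches, rather than only the best one, would be handed to the editing model. How those targets are aligned and edited jointly is beyond what this sketch covers.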