K-Nearest Neighbor Neural Machine Translation (kNN-MT) successfully incorporates an external corpus by retrieving word-level representations at test time. Generally, kNN-MT borrows an off-the-shelf context representation from the translation task, e.g., the output of the last decoder layer, as the query vector for the retrieval task. In this work, we highlight that coupling the representations of these two tasks is sub-optimal for fine-grained retrieval. To alleviate this, we leverage supervised contrastive learning to learn a distinctive retrieval representation derived from the original context representation. We also propose a fast and effective approach to constructing hard negative samples. Experimental results on five domains show that our approach improves both retrieval accuracy and BLEU score over vanilla kNN-MT.
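To make the idea of supervised contrastive learning over retrieval representations concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard supervised contrastive loss form, in which representations that share a label (here, hypothetically, the same target token) are pulled together and all others are pushed apart. The function name `supervised_contrastive_loss`, the `temperature` value, and the toy data are illustrative assumptions; the abstract does not specify the exact loss, and the hard negative construction it mentions is not modelled here.

```python
import numpy as np


def supervised_contrastive_loss(reps, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of representations.

    Assumption (not from the abstract): the standard supervised contrastive
    form, where every other sample with the same label is a positive and
    every other sample in the batch appears in the denominator.
    """
    reps = np.asarray(reps, dtype=np.float64)
    labels = np.asarray(labels)
    n = len(labels)

    # Positives: same label, excluding the anchor itself.
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0  # no anchor has a positive, so nothing to contrast

    # Cosine similarity scaled by the temperature.
    reps = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    sim = reps @ reps.T / temperature

    # Numerically stable log-sum-exp over all samples except the anchor.
    sim_masked = np.where(not_self, sim, -np.inf)
    row_max = sim_masked.max(axis=1, keepdims=True)
    log_denominator = row_max + np.log(
        np.exp(sim_masked - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = sim - log_denominator

    # Average log-probability of the positives, per anchor that has any.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / n_pos[has_pos]
    return float(per_anchor.mean())


# Toy batch: four 2-d "retrieval representations" forming two tight clusters.
toy_reps = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
aligned_loss = supervised_contrastive_loss(toy_reps, [0, 0, 1, 1])
misaligned_loss = supervised_contrastive_loss(toy_reps, [0, 1, 0, 1])
```

In this toy batch, `aligned_loss` is computed with labels that match the geometric clusters and `misaligned_loss` with labels that cut across them, so the former comes out smaller. Training a retrieval representation against such an objective would, under these assumptions, encourage vectors for the same target token to lie close together in the datastore.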