Neural machine translation (NMT) has become the de facto standard in real-world machine translation applications. However, NMT models can unpredictably produce severely pathological translations, known as hallucinations, that seriously undermine user trust. It is therefore crucial to implement effective preventive strategies that ensure these models function properly. In this paper, we address the problem of hallucination detection in NMT by following a simple intuition: because hallucinations are detached from the source content, they exhibit encoder-decoder attention patterns that are statistically different from those of good-quality translations. We frame this problem with an optimal transport formulation and propose a fully unsupervised, plug-in detector that can be used with any attention-based NMT model. Experimental results show that our detector not only outperforms all previous model-based detectors, but is also competitive with detectors that employ large models trained on millions of samples.
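To make the intuition concrete, the following is a minimal sketch of how an attention-based anomaly score could be computed with optimal transport. It is an illustration under stated assumptions, not the detector proposed in the paper: the choice of a uniform reference distribution, the use of token positions as the transport ground metric, and the function names `source_attention_mass`, `wasserstein_1d`, and `anomaly_score` are all hypothetical.

```python
import numpy as np


def source_attention_mass(attn):
    """Collapse a (target_len, source_len) attention matrix into a
    probability distribution over source tokens.

    Assumption: each row of `attn` is one target token's attention
    over the source tokens.
    """
    attn = np.asarray(attn, dtype=float)
    mass = attn.sum(axis=0)
    return mass / mass.sum()


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two distributions supported on
    positions 0..n-1 with unit spacing (closed form via CDF differences).

    Assumption: token position is used as the ground metric.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.abs(np.cumsum(p - q)).sum())


def anomaly_score(attn, reference=None):
    """Distance between the source attention mass of one translation and
    a reference distribution. A larger value means the attention pattern
    is further from the reference.

    Assumption: when no reference is given, a uniform distribution over
    source tokens is used as a stand-in for a well-behaved pattern.
    """
    mass = source_attention_mass(attn)
    if reference is None:
        reference = np.full(mass.shape, 1.0 / mass.size)
    return wasserstein_1d(mass, reference)


# Toy example: attention spread evenly over the source versus attention
# collapsed onto the last source token.
spread = np.eye(4)
collapsed = np.zeros((4, 4))
collapsed[:, 3] = 1.0

print(anomaly_score(spread))     # evenly spread attention
print(anomaly_score(collapsed))  # all attention on one source token
```

In this toy setting the collapsed pattern receives the higher score, which matches the intuition that attention detached from most of the source is anomalous. In practice a threshold on the score, or a reference built from attention patterns of translations known to be good, would have to be chosen on held-out data.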