Maximum a posteriori (MAP) decoding, a commonly used method for neural machine translation (NMT), selects the hypothesis with the highest estimated posterior probability. However, high estimated probability does not always correspond to high translation quality. Minimum Bayes Risk (MBR) decoding offers an alternative by selecting the hypothesis with the highest expected utility. In this work, we show that Quality Estimation (QE) reranking, which uses a QE model as a reranker, can be viewed as a variant of MBR. Inspired by this, we propose source-based MBR (sMBR) decoding, a novel approach that uses synthetic sources generated by backward translation as ``support hypotheses'' and a reference-free quality estimation metric as the utility function, making this the first work to use only sources in MBR decoding. Experiments show that sMBR significantly outperforms QE reranking and is competitive with standard MBR decoding. Furthermore, sMBR calls the utility function fewer times than MBR. Our findings suggest that sMBR is a promising approach for high-quality NMT decoding.
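The relationship between the three selection rules described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`mbr_decode`, `qe_rerank`, `smbr_decode`, `counted`) are hypothetical, and the token-overlap `jaccard` score is only a toy stand-in for a learned utility or QE metric. The sketch assumes that each rule takes the argmax of a mean utility over its support set, so QE reranking coincides with sMBR when the support set contains only the original source.

```python
from typing import Callable, Dict, Sequence, Tuple

# A utility maps (support item, hypothesis) to a score; higher is better.
Utility = Callable[[str, str], float]


def mbr_decode(hyps: Sequence[str], utility: Utility) -> str:
    """Standard MBR: the other hypotheses act as support (N * N utility calls)."""
    def expected(h: str) -> float:
        return sum(utility(other, h) for other in hyps) / len(hyps)
    return max(hyps, key=expected)


def qe_rerank(source: str, hyps: Sequence[str], qe: Utility) -> str:
    """QE reranking: score each hypothesis against the single original source."""
    return max(hyps, key=lambda h: qe(source, h))


def smbr_decode(sources: Sequence[str], hyps: Sequence[str], qe: Utility) -> str:
    """Source-based MBR sketch: average a reference-free score over K sources
    (assumed here to be synthetic sources from backward translation),
    giving N * K utility calls."""
    def expected(h: str) -> float:
        return sum(qe(s, h) for s in sources) / len(sources)
    return max(hyps, key=expected)


def jaccard(a: str, b: str) -> float:
    """Toy utility (assumption): token overlap, NOT a real QE metric."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def counted(fn: Utility) -> Tuple[Utility, Dict[str, int]]:
    """Wrap a utility so the number of calls can be inspected."""
    calls = {"n": 0}
    def wrapper(a: str, b: str) -> float:
        calls["n"] += 1
        return fn(a, b)
    return wrapper, calls
```

With N hypotheses and K synthetic sources, this sketch calls the utility N * N times for MBR and N * K times for sMBR, so sMBR needs fewer calls whenever K < N. In practice the utility would be a neural metric and the sources would be in a different language from the hypotheses; the monolingual toy score is used only to keep the example runnable.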