Maximum a posteriori (MAP) decoding, a commonly used decoding method for neural machine translation (NMT), aims to maximize the estimated posterior probability. However, high estimated probability does not always lead to high translation quality. Minimum Bayes Risk (MBR) decoding (\citealp{kumar2004minimum}) offers an alternative by seeking hypotheses with the highest expected utility. Inspired by Quality Estimation (QE) reranking, which uses a QE model as the ranker (\citealp{fernandes-etal-2022-quality}), we propose source-based MBR (sMBR) decoding, a novel approach that uses quasi-sources (generated via paraphrasing or back-translation) as ``support hypotheses'' and a reference-free quality estimation metric as the utility function. This is the first work to use only sources in MBR decoding. Experiments show that sMBR outperforms both QE reranking and standard MBR decoding. Our findings suggest that sMBR is a promising approach for NMT decoding.