This paper presents a new approach for assessing uncertainty in machine translation by simultaneously evaluating translation quality and providing a reliable confidence score. Our approach uses conformal predictive distributions to produce prediction intervals with guaranteed coverage: for any given significance level $\epsilon$, the true quality score of a translation is expected to fall outside the interval at a rate of at most $\epsilon$, or equivalently to fall inside it at a rate of at least $1-\epsilon$. We demonstrate that our method outperforms a simple but effective baseline on six different language pairs in terms of coverage and sharpness. Furthermore, we validate that our approach requires the data exchangeability assumption to hold for optimal performance.
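To make the coverage guarantee concrete, the following is a minimal sketch of split conformal prediction with absolute-residual scores. This is a simpler relative of the conformal predictive distributions used in this paper, not the method itself. The function names `split_conformal_interval` and `simulate`, the synthetic Gaussian data, and the noise level are all assumptions made for illustration; no real quality-estimation model or dataset is involved.

```python
import math
import random


def split_conformal_interval(cal_pred, cal_true, test_pred, epsilon):
    """Symmetric prediction intervals from a held-out calibration set.

    Under exchangeability of calibration and test points, each interval
    contains the true score with probability at least 1 - epsilon.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    # Nonconformity score: absolute error of the point prediction.
    scores = sorted(abs(t - p) for p, t in zip(cal_pred, cal_true))
    n = len(scores)
    # Finite-sample corrected rank of the calibration quantile.
    k = math.ceil((n + 1) * (1 - epsilon))
    # With too few calibration points the interval is unbounded.
    q = scores[k - 1] if k <= n else math.inf
    return [(p - q, p + q) for p in test_pred]


def simulate(n_cal=1000, n_test=5000, epsilon=0.1, seed=0):
    """Empirical coverage and mean width on synthetic (hypothetical) data."""
    rng = random.Random(seed)
    # Stand-ins for true quality scores and noisy model predictions.
    true = [rng.gauss(0.0, 1.0) for _ in range(n_cal + n_test)]
    pred = [t + rng.gauss(0.0, 0.5) for t in true]
    intervals = split_conformal_interval(
        pred[:n_cal], true[:n_cal], pred[n_cal:], epsilon
    )
    hits = sum(lo <= t <= hi for (lo, hi), t in zip(intervals, true[n_cal:]))
    mean_width = sum(hi - lo for lo, hi in intervals) / n_test
    return hits / n_test, mean_width


if __name__ == "__main__":
    coverage, mean_width = simulate()
    print(f"coverage={coverage:.3f}  mean width={mean_width:.3f}")
```

In this sketch, the empirical coverage corresponds to the coverage criterion and the mean interval width to sharpness: among methods that reach the target coverage of $1-\epsilon$, narrower intervals are preferable. The guarantee relies on exchangeability, so it may not hold if the test data are drawn from a different distribution than the calibration data.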