Minimum Bayes-Risk (MBR) decoding has been shown to be a powerful alternative to beam search decoding for a wide range of text generation tasks. However, computing the MBR objective takes a large amount of inference time, which makes the method infeasible in many situations where response time is critical. Confidence-based pruning (CBP) (Cheng and Vlachos, 2023) has recently been proposed to reduce inference time in machine translation tasks. Although it has been shown to significantly reduce the amount of computation, it requires hyperparameter tuning on a development set to be effective. To address this, we propose Approximate Minimum Bayes-Risk (AMBR) decoding, a hyperparameter-free method for running MBR decoding approximately. AMBR is derived from the observation that computing the sample-based MBR objective is an instance of the medoid identification problem. AMBR uses the Correlated Sequential Halving (CSH) algorithm (Baharav and Tse, 2019), the best approximation algorithm to date for the medoid identification problem, to compute the sample-based MBR objective. We evaluate AMBR on machine translation, text summarization, and image captioning tasks. The results show that AMBR performs on par with CBP, where CBP's hyperparameters are selected by an oracle for each given computation budget.
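To make the medoid observation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy utility (`token_f1`, a hypothetical stand-in for a real metric) and a small hand-written sample set. `mbr_exact` scores every sample against every other sample, which is the quadratic computation that makes MBR slow. `mbr_halving` is a simplified sequential-halving routine in the spirit of CSH: in each round, all surviving candidates are scored against the same randomly drawn references and the lower-scoring half is discarded. The budget split per round is an assumption of this sketch and may differ from the schedule used in the paper.

```python
import math
import random
from typing import Callable, Sequence

Utility = Callable[[str, str], float]


def token_f1(hyp: str, ref: str) -> float:
    """Toy utility (assumption for illustration): F1 over unique tokens."""
    h, r = set(hyp.split()), set(ref.split())
    overlap = len(h & r)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_exact(samples: Sequence[str], utility: Utility) -> int:
    """Index of the sample with the highest mean utility against all samples.

    This is the medoid of the sample set under `utility`; it costs
    len(samples) ** 2 utility calls.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    n = len(samples)
    scores = [sum(utility(h, r) for r in samples) / n for h in samples]
    return max(range(n), key=scores.__getitem__)


def mbr_halving(samples: Sequence[str], utility: Utility,
                budget: int, rng: random.Random) -> int:
    """Approximate medoid using roughly `budget` utility calls.

    Simplified sketch: every survivor in a round is evaluated against the
    same references (the "correlated" part), then half are eliminated.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    n = len(samples)
    survivors = list(range(n))
    rounds = max(1, math.ceil(math.log2(n)))
    while len(survivors) > 1:
        # References per candidate in this round, capped at n.
        t = min(max(budget // (len(survivors) * rounds), 1), n)
        refs = rng.sample(range(n), t)  # shared by all survivors
        est = {
            i: sum(utility(samples[i], samples[j]) for j in refs) / t
            for i in survivors
        }
        survivors.sort(key=est.__getitem__, reverse=True)
        if t == n:
            # Every reference was used, so the estimates are exact.
            return survivors[0]
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


samples = ["a b c d", "a b c", "a b e", "x y z"]
best_exact = mbr_exact(samples, token_f1)
best_approx = mbr_halving(samples, token_f1, budget=8, rng=random.Random(0))
```

With a budget large enough to evaluate all references in the first round, `mbr_halving` returns the same index as `mbr_exact`; with smaller budgets it trades accuracy for fewer utility calls, and no hyperparameter other than the budget has to be tuned.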