Nonparametric estimation with binning is widely used to evaluate calibration error and to recalibrate machine learning models. Theoretical analyses of the bias induced by this estimation approach have recently been actively pursued; however, how well the calibration error generalizes to unseen data remains poorly understood. Moreover, although many recalibration algorithms have been proposed, their generalization performance lacks theoretical guarantees. To address these problems, we conduct a generalization analysis of the calibration error under the probably approximately correct (PAC) Bayes framework. This approach enables us to derive the first optimizable upper bound on the generalization error in the calibration context. We then propose a generalization-aware recalibration algorithm based on our generalization theory. Numerical experiments show that our algorithm improves the performance of Gaussian-process-based recalibration on various benchmark datasets and models.
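For readers unfamiliar with binning-based estimation, the following is a minimal sketch of the standard equal-width binned estimator of the expected calibration error (ECE) for top-label confidences. It illustrates only the kind of estimator the abstract refers to; it is not the PAC-Bayes bound or the recalibration algorithm proposed here. The function name `binned_ece`, its parameters, and the choice of equal-width bins are assumptions made for illustration.

```python
import numpy as np


def binned_ece(confidences, correct, n_bins=10):
    """Equal-width binned ECE estimate (illustrative sketch, not the paper's method).

    confidences: predicted top-label probabilities in [0, 1]
    correct:     1 if the prediction was right, 0 otherwise
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Equal-width bin edges on [0, 1]; only the interior edges are used for lookup.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_idx = np.clip(np.digitize(confidences, edges[1:-1]), 0, n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            # Gap between empirical accuracy and mean confidence inside the bin,
            # weighted by the fraction of samples that fall into the bin.
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return float(ece)
```

The estimate depends on the number of bins and on the sample size, which is the source of the estimation bias mentioned above; evaluating it on held-out data gives only an empirical view of how the calibration error generalizes.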