Recalibrating probabilistic classifiers is vital for enhancing the reliability and accuracy of predictive models. Although numerous recalibration algorithms have been developed, a comprehensive theory that integrates calibration with sharpness (which is essential for maintaining predictive power) is still lacking. In this paper, we introduce the concept of minimum-risk recalibration within the framework of mean-squared-error (MSE) decomposition, offering a principled approach for evaluating and recalibrating probabilistic classifiers. Using this framework, we analyze the uniform-mass binning (UMB) recalibration method and establish a finite-sample risk upper bound of order $\tilde{O}(B/n + 1/B^2)$, where $B$ is the number of bins and $n$ is the sample size. By balancing calibration and sharpness, we further determine that the optimal number of bins for UMB scales as $n^{1/3}$, resulting in a risk bound of approximately $O(n^{-2/3})$. Additionally, we tackle the challenge of label shift by proposing a two-stage approach that adjusts the recalibration function using limited labeled data from the target domain. Our results show that transferring a calibrated classifier requires significantly fewer target samples than recalibrating from scratch. We validate our theoretical findings through numerical simulations, which confirm the tightness of the proposed bounds, the optimal number of bins, and the effectiveness of label shift adaptation.
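The $n^{1/3}$ scaling can be recovered from the stated bound by a short balancing argument. The sketch below is illustrative only: the constants $c_1, c_2 > 0$ are hypothetical placeholders for whatever constants the bound carries, and logarithmic factors hidden by $\tilde{O}$ are ignored.

$$
f(B) = c_1 \frac{B}{n} + \frac{c_2}{B^2}, \qquad
f'(B) = \frac{c_1}{n} - \frac{2 c_2}{B^3} = 0
\;\Longrightarrow\;
B^\star = \left(\frac{2 c_2\, n}{c_1}\right)^{1/3} \propto n^{1/3},
$$

$$
f(B^\star) = c_1 \left(\frac{2 c_2}{c_1}\right)^{1/3} n^{-2/3} + c_2 \left(\frac{2 c_2}{c_1}\right)^{-2/3} n^{-2/3} = O\!\left(n^{-2/3}\right).
$$

Under this reading, the $B/n$ term grows with the number of bins while the $1/B^2$ term shrinks, and the two are of the same order exactly when $B$ grows like $n^{1/3}$.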
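To make the binning procedure concrete, the following is a minimal sketch of uniform-mass binning as it is commonly described: bin edges are placed at empirical quantiles of the calibration scores so that each bin holds roughly the same number of points, and each bin outputs the mean label of the points it contains. The function names (`fit_umb`, `apply_umb`, `suggested_num_bins`), the binary-label setting, and the choice of constant 1 in the $n^{1/3}$ rule are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def suggested_num_bins(n):
    """Hypothetical rule of thumb: B proportional to n^(1/3), with constant 1."""
    return max(1, int(round(n ** (1.0 / 3.0))))


def fit_umb(scores, labels, num_bins):
    """Fit uniform-mass binning on calibration data.

    scores: classifier outputs in [0, 1]; labels: binary labels (0 or 1).
    Returns the interior bin edges and the per-bin mean label.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Interior edges at empirical quantiles, so bins hold roughly equal mass.
    quantile_levels = np.arange(1, num_bins) / num_bins
    edges = np.quantile(scores, quantile_levels)
    # Assign each score to a bin index in {0, ..., num_bins - 1}.
    bin_ids = np.searchsorted(edges, scores, side="right")
    counts = np.bincount(bin_ids, minlength=num_bins)
    label_sums = np.bincount(bin_ids, weights=labels, minlength=num_bins)
    # Empty bins (possible with heavily tied scores) fall back to the overall mean.
    bin_means = np.where(counts > 0, label_sums / np.maximum(counts, 1), labels.mean())
    return edges, bin_means


def apply_umb(scores, edges, bin_means):
    """Map new scores to the mean label of the bin they fall into."""
    bin_ids = np.searchsorted(edges, np.asarray(scores, dtype=float), side="right")
    return bin_means[bin_ids]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000
    scores = rng.uniform(size=n)
    # Synthetic miscalibrated classifier: the true probability is scores**2.
    labels = (rng.uniform(size=n) < scores ** 2).astype(float)
    num_bins = suggested_num_bins(n)
    edges, bin_means = fit_umb(scores, labels, num_bins)
    print(num_bins, np.round(apply_umb([0.1, 0.5, 0.9], edges, bin_means), 3))
```

The recalibrated output is piecewise constant, so a larger `num_bins` preserves more of the original score resolution (sharpness) at the cost of noisier per-bin estimates (calibration), which is the trade-off the bound quantifies.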