The distance from calibration, introduced by Błasiok, Gopalan, Hu, and Nakkiran (STOC 2023), has recently emerged as a central measure of miscalibration for probabilistic predictors. We study the fundamental problems of computing and estimating this quantity, given either an exact description of the data distribution or only sample access to it. We give an efficient algorithm that exactly computes the calibration distance when the distribution has a uniform marginal and noiseless labels, improving on the $O(1/\sqrt{|\mathcal{X}|})$ additive approximation of Qiao and Zheng (COLT 2024) for this special case. Perhaps surprisingly, the problem becomes $\mathsf{NP}$-hard when either of the two assumptions is removed. For the general case, we extend our algorithm to a polynomial-time approximation scheme. For the estimation problem, we show that $\Theta(1/\varepsilon^3)$ samples are necessary and sufficient for the empirical calibration distance to be upper bounded by the true distance plus $\varepsilon$. In contrast, for two-sided estimation, a polynomial dependence on the domain size, which the learning-based baseline incurs, is unavoidable. Our positive results rely on simple sparsifications of both the distribution and the target predictor, which significantly reduce the search space for computation and yield stronger concentration for the estimation problem. To prove the hardness results, we introduce new techniques for certifying lower bounds on the calibration distance, a problem that is hard in general due to its $\textsf{co-NP}$-completeness.
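To make the quantity concrete, the following is a minimal brute-force sketch for the special case of a uniform marginal and noiseless labels. It assumes the standard definition of the distance from calibration as the smallest expected $\ell_1$ distance between the predictor and any perfectly calibrated predictor on the same domain. It is not the efficient algorithm described above: it enumerates every partition of the domain, so it runs in exponential time and is only usable on a handful of points. The function names `set_partitions` and `brute_force_calibration_distance` are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from typing import Iterator, List, Sequence


def set_partitions(n: int) -> Iterator[List[List[int]]]:
    """Yield every partition of {0, ..., n-1} into non-empty blocks."""

    def extend(i: int, blocks: List[List[int]]) -> Iterator[List[List[int]]]:
        if i == n:
            # Yield a copy so callers are unaffected by later mutation.
            yield [list(b) for b in blocks]
            return
        for b in blocks:  # place point i in an existing block
            b.append(i)
            yield from extend(i + 1, blocks)
            b.pop()
        blocks.append([i])  # or open a new block for point i
        yield from extend(i + 1, blocks)
        blocks.pop()

    yield from extend(0, [])


def brute_force_calibration_distance(
    f: Sequence[Fraction], y: Sequence[int]
) -> Fraction:
    """Exact distance from calibration on a tiny domain (assumed setting).

    Assumptions: the marginal is uniform over len(f) points, point i has the
    deterministic label y[i] in {0, 1}, and f[i] is the prediction at point i.
    A predictor g is perfectly calibrated exactly when each of its level sets
    has value equal to the mean label on that set, so it suffices to try
    every partition and assign each block its mean label.
    """
    n = len(f)
    if n == 0 or n != len(y):
        raise ValueError("f and y must be non-empty and of equal length")
    best = None
    for blocks in set_partitions(n):
        cost = Fraction(0)
        for block in blocks:
            value = Fraction(sum(y[i] for i in block), len(block))
            cost += sum(abs(Fraction(f[i]) - value) for i in block)
        cost /= n  # expectation under the uniform marginal
        if best is None or cost < best:
            best = cost
    return best
```

For example, with predictions `[1, 0]` and labels `[0, 1]`, keeping the two points separate forces the calibrated predictor to output `0` and `1`, at distance 1, while merging them gives the constant predictor `1/2`, at distance `1/2`. The search space has one candidate per set partition, which grows as the Bell numbers, and this is the blow-up that sparsification of the distribution and the target predictor is meant to avoid.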