Accurate probabilistic predictions are essential for optimal decision making. While neural network miscalibration has been studied primarily in classification, we investigate it in the less-explored domain of regression. We conduct the largest empirical study to date to assess the probabilistic calibration of neural networks. We also analyze how well recalibration, conformal, and regularization methods improve probabilistic calibration. Additionally, we introduce novel differentiable recalibration and regularization methods, uncovering new insights into their effectiveness. Our findings reveal that regularization methods offer a favorable tradeoff between calibration and sharpness. Post-hoc methods exhibit superior probabilistic calibration, which we attribute to the finite-sample coverage guarantee of conformal prediction. Furthermore, we demonstrate that quantile recalibration can be viewed as a special case of conformal prediction. Our study is fully reproducible and implemented in a common code base for fair comparisons.
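To make the notion of probabilistic calibration concrete, the following is a minimal sketch of quantile recalibration in one common formulation: the predictive CDF is composed with the empirical CDF of the probability integral transform (PIT) values computed on a held-out calibration set. It is not the study's code base. The Gaussian toy model, the deliberately overconfident standard deviation, and all function names (`pit_values`, `fit_recalibration_map`, `calibration_error`) are assumptions made for illustration only.

```python
import math

import numpy as np


def gaussian_cdf(y, mean, std):
    """CDF of N(mean, std^2) evaluated elementwise at y."""
    z = (np.asarray(y, dtype=float) - mean) / (std * math.sqrt(2.0))
    return 0.5 * (1.0 + np.vectorize(math.erf)(z))


def pit_values(y, mean, std):
    """PIT values F(y | x); uniform on [0, 1] if the model is calibrated."""
    return gaussian_cdf(y, mean, std)


def fit_recalibration_map(pit_cal):
    """Return the empirical CDF of the calibration PIT values.

    Applying this map to a predicted CDF value gives the recalibrated one.
    """
    sorted_pit = np.sort(np.asarray(pit_cal, dtype=float))
    n = len(sorted_pit)

    def recalibrate(p):
        # Fraction of calibration PIT values that are <= p.
        return np.searchsorted(sorted_pit, p, side="right") / n

    return recalibrate


def calibration_error(pit, num_levels=99):
    """Mean absolute gap between empirical and nominal coverage levels."""
    levels = np.linspace(0.01, 0.99, num_levels)
    empirical = np.array([(pit <= a).mean() for a in levels])
    return float(np.mean(np.abs(empirical - levels)))


# Toy setup (assumption): targets are N(0, 1), but the model predicts
# N(0, 0.5^2), so it is overconfident.
rng = np.random.default_rng(0)
y_cal = rng.normal(0.0, 1.0, size=2000)
y_test = rng.normal(0.0, 1.0, size=2000)

pit_cal = pit_values(y_cal, mean=0.0, std=0.5)
pit_test = pit_values(y_test, mean=0.0, std=0.5)

recalibrate = fit_recalibration_map(pit_cal)

err_before = calibration_error(pit_test)
err_after = calibration_error(recalibrate(pit_test))
print(f"calibration error before: {err_before:.3f}, after: {err_after:.3f}")
```

In this toy setting the recalibrated PIT values on the test set should be much closer to uniform than the raw ones. The step-function map used here is not differentiable; the differentiable variants mentioned above would replace it with a smooth estimate, which this sketch does not attempt.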