Despite their remarkable performance, deep neural networks are well documented to be overconfident in their predictions. Finding effective and efficient calibration methods for neural networks is therefore an important step towards better uncertainty quantification in deep learning. In this manuscript, we introduce a novel calibration technique named expectation consistency (EC), which consists of a post-training rescaling of the last-layer weights, chosen so that the average confidence on the validation set coincides with the average proportion of correct labels. First, we show that EC achieves calibration performance similar to that of temperature scaling (TS) across different neural network architectures and data sets, while requiring a comparable number of validation samples and comparable computational resources. We argue, however, that EC offers a principled approach grounded in a Bayesian optimality principle known as the Nishimori identity. Next, we provide an asymptotic characterization of both TS and EC in a synthetic setting and show that their performance depends crucially on the target function. In particular, we discuss examples where EC significantly outperforms TS.
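
The EC condition described above can be illustrated with a minimal sketch. It assumes that rescaling the last-layer weights amounts to dividing the validation logits by a single scalar temperature T, and that T is found by bisection; the function names (`mean_confidence`, `ec_temperature`) and the search bounds are hypothetical choices made for this illustration, not taken from the manuscript.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_confidence(logits, T):
    # Average of the largest predicted probability after rescaling by 1/T.
    return softmax(logits / T).max(axis=1).mean()

def ec_temperature(logits, labels, lo=1e-3, hi=1e3, iters=100):
    # Assumption: the EC rescaling is a single scalar T applied to the logits.
    # Target: mean validation confidence equals validation accuracy.
    acc = (logits.argmax(axis=1) == labels).mean()
    # Mean confidence decreases monotonically in T, so bisect on a log scale.
    # A solution exists only if 1/K < acc < 1 for K classes.
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if mean_confidence(logits, mid) > acc:
            lo = mid  # still overconfident: increase the temperature
        else:
            hi = mid  # underconfident: decrease the temperature
    return np.sqrt(lo * hi), acc
```

Given validation logits of shape `(n_samples, n_classes)` and integer labels, `ec_temperature(logits, labels)` returns the fitted temperature together with the validation accuracy it was matched to. Unlike TS, which minimizes the validation cross-entropy, this sketch only solves a one-dimensional root-finding problem on the average confidence.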