Despite their strong performance, deep neural networks are widely reported to be overconfident in their predictions. Finding effective and efficient calibration methods for neural networks is therefore an important step towards better uncertainty quantification in deep learning. In this manuscript, we introduce a novel calibration technique named expectation consistency (EC), a post-training rescaling of the last-layer weights that enforces that the average validation confidence coincides with the average proportion of correct labels. First, we show that the EC method achieves calibration performance similar to temperature scaling (TS) across different neural network architectures and data sets, while requiring a similar number of validation samples and similar computational resources. However, we argue that EC provides a principled method grounded in a Bayesian optimality principle known as the Nishimori identity. Next, we provide an asymptotic characterization of both TS and EC in a synthetic setting and show that their performance crucially depends on the target function. In particular, we discuss examples where EC significantly outperforms TS.
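The EC condition described in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the text: that rescaling the last-layer weights amounts to dividing the logits by a single scalar T (which holds when the last-layer bias is rescaled as well, or is absent), and that the confidence of a prediction is its maximum softmax probability. The function names `mean_confidence` and `fit_ec_temperature`, and the bisection solver, are hypothetical choices made for illustration and are not taken from the authors' implementation.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_confidence(logits, T):
    # Average over samples of the largest softmax probability at scale T.
    return softmax(logits / T).max(axis=1).mean()

def fit_ec_temperature(logits, labels, lo=1e-3, hi=1e3, iters=100):
    # Hypothetical helper: find T such that the average validation
    # confidence matches the validation accuracy (the EC condition).
    accuracy = (logits.argmax(axis=1) == labels).mean()
    # The mean confidence is non-increasing in T, so bisection (here on a
    # logarithmic scale) converges whenever the target lies in [lo, hi].
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if mean_confidence(logits, mid) > accuracy:
            lo = mid   # still overconfident: increase T
        else:
            hi = mid   # underconfident or matched: decrease T
    return np.sqrt(lo * hi)

if __name__ == "__main__":
    # Synthetic validation set (assumed for illustration): labels are drawn
    # from softmax(z), but the "network" outputs 3 * z, so it is overconfident.
    rng = np.random.default_rng(0)
    z = rng.normal(size=(2000, 5))
    labels = (z + rng.gumbel(size=z.shape)).argmax(axis=1)
    logits = 3.0 * z
    T = fit_ec_temperature(logits, labels)
    print("fitted T:", T)
    print("mean confidence:", mean_confidence(logits, T))
    print("accuracy:", (logits.argmax(axis=1) == labels).mean())
```

In this sketch both EC and TS fit one scalar on validation data; the difference is the criterion. TS would minimize the validation cross-entropy over T, whereas the condition above only matches two averages, which is why a one-dimensional root search suffices.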