Model miscalibration is frequently observed in modern deep neural networks. Recent work aims to improve calibration directly through a differentiable calibration proxy; however, the resulting calibration estimate is often biased by the binning mechanism. In this work, we propose to learn better-calibrated models via meta-regularization, which has two components: (1) a gamma network (gamma-net), a meta-learner that outputs sample-wise gamma values (as continuous variables) for the focal loss, regularizing the backbone network; and (2) smooth expected calibration error (SECE), a Gaussian-kernel-based, unbiased, and differentiable surrogate for ECE that enables smooth optimization of gamma-net. We evaluate the effectiveness of the proposed approach in regularizing neural networks towards improved and unbiased calibration on three computer vision datasets. We empirically demonstrate that (a) learning sample-wise gamma values as continuous variables effectively improves calibration; (b) SECE smoothly optimizes gamma-net towards calibration that is unbiased and robust with respect to binning schemes; and (c) the combination of gamma-net and SECE achieves the best calibration performance across various calibration metrics while retaining highly competitive predictive performance compared with several recently proposed methods.
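To make the two components concrete, the following is a minimal NumPy sketch of (i) a focal loss that accepts per-sample gamma values and (ii) a Gaussian-kernel-smoothed calibration error in the spirit of SECE. The function names, the kernel bandwidth, and the evaluation grid are illustrative assumptions, not the paper's exact formulation; in training, the gammas would come from gamma-net and SECE would be computed on differentiable tensors rather than NumPy arrays.

```python
import numpy as np


def focal_loss(probs, targets, gammas):
    """Focal loss with a separate (continuous) gamma per sample.

    probs:   (n, k) predicted class probabilities
    targets: (n,)   integer class labels
    gammas:  (n,)   per-sample focusing parameters (e.g. from gamma-net)
    """
    p = probs[np.arange(len(targets)), targets]          # prob of true class
    p = np.clip(p, 1e-12, 1.0)
    # Standard focal form -(1 - p)^gamma * log(p), averaged over samples
    return float(np.mean(-((1.0 - p) ** gammas) * np.log(p)))


def smooth_ece(confidences, correct, bandwidth=0.05, num_points=100):
    """Binning-free ECE surrogate: Gaussian-kernel weighting instead of
    hard bins, so the estimate varies smoothly with the confidences.

    confidences: (n,) top-class confidence per sample
    correct:     (n,) 1.0 if the prediction was correct, else 0.0
    """
    t = np.linspace(0.0, 1.0, num_points)                # evaluation grid
    # Kernel weights, shape (num_points, n): soft membership of each
    # sample at each grid point (replaces hard bin assignment)
    w = np.exp(-0.5 * ((t[:, None] - confidences[None, :]) / bandwidth) ** 2)
    w_sum = np.clip(w.sum(axis=1), 1e-12, None)
    density = w_sum / w_sum.sum()                        # confidence density
    acc = (w * correct[None, :]).sum(axis=1) / w_sum     # smoothed accuracy
    conf = (w * confidences[None, :]).sum(axis=1) / w_sum
    # Density-weighted |accuracy - confidence| gap, analogous to ECE
    return float((density * np.abs(acc - conf)).sum())
```

With `gammas` set to zero, `focal_loss` reduces to ordinary cross-entropy, and for a perfectly calibrated set of predictions (smoothed accuracy equal to confidence everywhere) `smooth_ece` returns zero; unlike binned ECE, its value does not jump when a confidence crosses a bin edge.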