Meta-Calibration Regularized Neural Networks

Miscalibration, the mismatch between predicted probability and the true likelihood of correctness, is frequently observed in modern deep neural networks. Recent work addresses this problem by training calibrated models directly, optimizing a proxy of the calibration error alongside the conventional objective. Meta-Calibration (MC) recently showed that meta-learning is effective for learning better-calibrated models. In this work, we extend MC with two main components: (1) gamma network (gamma-net), a meta network that learns a sample-wise gamma in a continuous space for the focal loss used to optimize the backbone network; (2) smooth expected calibration error (SECE), a Gaussian-kernel-based, unbiased and differentiable ECE intended to optimize gamma-net smoothly. The proposed method regularizes the neural network towards better calibration while retaining predictive performance. Our experiments show that (a) learning a sample-wise gamma in a continuous space can effectively perform calibration; (b) SECE smoothly optimizes gamma-net towards better robustness to binning schemes; (c) the combination of gamma-net and SECE achieves the best calibration performance across various calibration metrics and retains very competitive predictive performance compared to multiple recently proposed methods on three datasets.
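To make the two ingredients concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. `focal_loss` is the standard focal loss evaluated with a separate gamma per sample; the gamma values are hand-picked placeholders standing in for what gamma-net would output. `soft_binned_ece` is a hypothetical Gaussian-kernel soft-binning calibration error, one plausible way to replace hard ECE bins with smooth memberships; the abstract does not define SECE, so the function name, the number of kernel centers, and the bandwidth are all assumptions.

```python
import math


def focal_loss(probs, label, gamma):
    """Standard focal loss for one sample: -(1 - p_y)**gamma * log(p_y).

    Assumes probs[label] > 0. With gamma = 0 this is plain cross-entropy.
    """
    p_true = probs[label]
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def soft_binned_ece(confidences, correct, num_centers=15, bandwidth=0.05):
    """Illustrative soft-binned calibration error (an assumption, not the paper's SECE).

    Each sample is spread over evenly spaced centers in [0, 1] with Gaussian
    weights that are normalised to sum to 1 per sample, so the hard bin
    assignment of ordinary ECE becomes a smooth membership.
    """
    n = len(confidences)
    centers = [(k + 0.5) / num_centers for k in range(num_centers)]
    mass = [0.0] * num_centers      # soft count of samples per center
    conf_sum = [0.0] * num_centers  # membership-weighted confidence
    acc_sum = [0.0] * num_centers   # membership-weighted correctness
    for p, a in zip(confidences, correct):
        kernel = [math.exp(-0.5 * ((p - c) / bandwidth) ** 2) for c in centers]
        norm = sum(kernel)
        for k, w in enumerate(kernel):
            m = w / norm
            mass[k] += m
            conf_sum[k] += m * p
            acc_sum[k] += m * a
    ece = 0.0
    for k in range(num_centers):
        if mass[k] > 1e-12:  # skip centers that received essentially no mass
            gap = abs(acc_sum[k] / mass[k] - conf_sum[k] / mass[k])
            ece += (mass[k] / n) * gap
    return ece


if __name__ == "__main__":
    # Three toy samples: predicted class probabilities, true label, and a
    # placeholder per-sample gamma (in the paper this would come from gamma-net).
    batch = [
        ([0.7, 0.2, 0.1], 0, 1.0),
        ([0.1, 0.8, 0.1], 1, 3.0),
        ([0.3, 0.3, 0.4], 0, 0.5),
    ]
    losses = [focal_loss(p, y, g) for p, y, g in batch]
    confidences = [max(p) for p, _, _ in batch]
    correct = [1.0 if p.index(max(p)) == y else 0.0 for p, y, _ in batch]
    print("per-sample focal losses:", losses)
    print("soft-binned calibration error:", soft_binned_ece(confidences, correct))
```

A larger gamma down-weights samples the model already classifies confidently, which is why a per-sample gamma gives finer control than a single global value. Because the Gaussian memberships vary smoothly with confidence, a quantity of this kind has useful gradients almost everywhere, whereas hard bin assignment gives none with respect to bin membership.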

