Multi-Head Multi-Loss Model Calibration

Delivering meaningful uncertainty estimates is essential for the successful deployment of machine learning models in clinical practice. A central aspect of uncertainty quantification is the ability of a model to return predicted confidences that are well aligned with the actual probability of the model being correct, a property known as model calibration. Although many methods have been proposed to improve calibration, no technique can match the simple but expensive approach of training an ensemble of deep neural networks. In this paper we introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles, yet retains their calibration capabilities. The idea is to replace the common linear classifier at the end of a network with a set of heads that are supervised with different loss functions to enforce diversity in their predictions. Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights differ among the branches. We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy on two challenging datasets for histopathological and endoscopic image classification. Our experiments indicate that Multi-Head Multi-Loss classifiers are inherently well-calibrated, outperforming other recent calibration techniques and even challenging Deep Ensembles' performance. Code to reproduce our experiments can be found at \url{https://github.com/agaldran/mhml_calibration}.
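The head-and-loss arrangement described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation (their code is in the repository linked above). The abstract only states that each head minimizes a weighted Cross-Entropy loss with head-specific weights and that predictions are averaged. Everything else here is an assumption made for illustration: the function names, the weighting scheme in `make_head_weights` (head h up-weights classes c with c mod H = h), the `boost` value, taking the mean of the per-head losses, and averaging probabilities rather than logits.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits, label, class_weights):
    """Weighted CE for one sample: -w[label] * log(softmax(logits)[label])."""
    probs = softmax(logits)
    return -class_weights[label] * math.log(probs[label])


def make_head_weights(num_heads, num_classes, boost=2.0):
    """Hypothetical scheme (an assumption, not taken from the abstract):
    head h up-weights every class c with c % num_heads == h."""
    return [
        [boost if c % num_heads == h else 1.0 for c in range(num_classes)]
        for h in range(num_heads)
    ]


def multi_head_loss(head_logits, label, head_weights):
    """Mean of the per-head weighted CE losses (mean vs. sum is an assumption)."""
    losses = [
        weighted_cross_entropy(logits, label, weights)
        for logits, weights in zip(head_logits, head_weights)
    ]
    return sum(losses) / len(losses)


def averaged_prediction(head_logits):
    """Average the per-head softmax outputs into one probability vector
    (averaging probabilities rather than logits is an assumption)."""
    per_head = [softmax(logits) for logits in head_logits]
    num_heads = len(per_head)
    num_classes = len(per_head[0])
    return [sum(p[c] for p in per_head) / num_heads for c in range(num_classes)]


# Toy example: 2 heads, 4 classes, one sample whose true label is class 0.
example_logits = [[2.0, 0.5, 0.1, -1.0], [1.0, 1.5, 0.0, -0.5]]
example_weights = make_head_weights(num_heads=2, num_classes=4)
print(multi_head_loss(example_logits, 0, example_weights))
print(averaged_prediction(example_logits))
```

In a real network the lists of logits would come from several linear layers applied to the same backbone features, so the extra cost over a single classifier is limited to those small heads.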

