Beyond calibration: estimating the grouping loss of modern neural networks

Ensuring that a classifier gives reliable confidence scores is essential for informed decision-making. To this end, recent work has focused on miscalibration, i.e., the over- or under-confidence of model scores. Yet calibration is not enough: even a perfectly calibrated classifier with the best possible accuracy can have confidence scores that are far from the true posterior probabilities. This is due to the grouping loss, created by samples with the same confidence scores but different true posterior probabilities. Proper scoring rule theory shows that, given the calibration loss, the missing piece to characterize individual errors is the grouping loss. While there are many estimators of the calibration loss, none exists for the grouping loss in standard settings. Here, we propose an estimator to approximate the grouping loss. We show that modern neural network architectures in vision and NLP exhibit grouping loss, notably under distribution shift, which highlights the importance of pre-production validation.
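To make the idea of grouping loss concrete, here is a minimal sketch on a toy binary problem. It is not the estimator proposed in the paper: it assumes the true posteriors are known, which is never the case in practice, and all numbers and names (`samples`, `brier_decomposition`) are invented for illustration. It uses the standard decomposition of the expected Brier score into a calibration term E[(S - C)^2], a grouping term E[(C - Q)^2], and an irreducible term E[Q(1 - Q)], where S is the confidence score, C = E[Y | S] is the calibrated score, and Q = P(Y = 1 | X) is the true posterior.

```python
# Toy population: the classifier assigns the same confidence to every sample,
# but the samples fall into two equally likely subgroups with different
# true posteriors Q = P(Y=1 | X). All numbers are made up for illustration.
samples = [
    # (confidence S, true posterior Q, probability mass)
    (0.7, 0.9, 0.5),
    (0.7, 0.5, 0.5),
]


def brier_decomposition(samples):
    """Return (calibration, grouping, irreducible, total) Brier terms."""
    # C(s) = E[Q | S = s]: the calibrated score for each confidence level.
    mass, weighted_q = {}, {}
    for s, q, w in samples:
        mass[s] = mass.get(s, 0.0) + w
        weighted_q[s] = weighted_q.get(s, 0.0) + w * q
    calibrated = {s: weighted_q[s] / mass[s] for s in mass}

    # Calibration loss: gap between the score and the calibrated score.
    calibration = sum(w * (s - calibrated[s]) ** 2 for s, q, w in samples)
    # Grouping loss: spread of true posteriors among samples sharing a score.
    grouping = sum(w * (calibrated[s] - q) ** 2 for s, q, w in samples)
    # Irreducible loss: label noise that no classifier can remove.
    irreducible = sum(w * q * (1 - q) for s, q, w in samples)
    # Expected Brier score E[(S - Y)^2] with Y ~ Bernoulli(Q).
    total = sum(
        w * (q * (s - 1) ** 2 + (1 - q) * s ** 2) for s, q, w in samples
    )
    return calibration, grouping, irreducible, total


cal, grp, irr, total = brier_decomposition(samples)
print(f"calibration={cal:.4f} grouping={grp:.4f} "
      f"irreducible={irr:.4f} total={total:.4f}")
```

In this toy setting the score 0.7 matches the average posterior of the samples that receive it, so the calibration term is zero, and always predicting the positive class is accuracy-optimal because both posteriors are at least 0.5. The grouping term is nevertheless positive, because individual samples have posteriors of 0.9 or 0.5 rather than 0.7.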

