On Temperature Scaling and Conformal Prediction of Deep Classifiers

In many classification applications, the prediction of a deep neural network (DNN) based classifier needs to be accompanied by some indication of confidence. Two popular approaches to this are: 1) Calibration, which modifies the classifier's softmax values so that the maximal value better estimates the probability that the prediction is correct; and 2) Conformal Prediction (CP), which produces a prediction set of candidate labels that contains the true label with a user-specified probability, guaranteeing marginal coverage rather than, e.g., per-class coverage. In practice, both types of indication are desirable, yet the interplay between them has not been investigated so far. We start this paper with an extensive empirical study of the effect of the popular Temperature Scaling (TS) calibration on prominent CP methods and reveal that, while TS improves the class-conditional coverage of adaptive CP methods, it surprisingly increases their prediction set sizes. Subsequently, we explore the effect of TS beyond its calibration application and offer simple guidelines for practitioners to trade off prediction set size against conditional coverage of adaptive CP methods while effectively combining them with calibration. Finally, we present a theoretical analysis of the effect of TS on the prediction set sizes, revealing several mathematical properties of the procedure, from which we derive an explanation for this unintuitive phenomenon.
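To make the two ingredients concrete, the following is a minimal sketch, not the paper's code, of applying a temperature T to the logits before the softmax and then building conformal prediction sets with a non-randomized variant of an adaptive score (cumulative softmax mass down to the candidate label, in the style of APS). The synthetic data, the function names, and the fixed temperatures are all illustrative assumptions; in practice TS calibration would fit T on held-out data, typically by minimizing the negative log-likelihood.

```python
import numpy as np

def softmax_t(logits, T=1.0):
    """Softmax of logits / T (T = 1 leaves the classifier unchanged)."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def _sorted_cumsum(probs):
    # Rank classes by decreasing probability and accumulate their mass.
    order = np.argsort(-probs, axis=1)
    cum = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    return order, cum

def aps_scores(probs, labels):
    """Adaptive score: total mass of classes ranked at or above the true label."""
    order, cum = _sorted_cumsum(probs)
    rank = np.argmax(order == labels[:, None], axis=1)
    return cum[np.arange(len(labels)), rank]

def conformal_quantile(scores, alpha):
    """The ceil((n+1)(1-alpha))-th smallest calibration score (inf if out of range)."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else np.sort(scores)[k - 1]

def aps_sets(probs, qhat):
    """Boolean mask (n, K): class is in the set iff its score is <= qhat."""
    order, cum = _sorted_cumsum(probs)
    mask = np.zeros(probs.shape, dtype=bool)
    np.put_along_axis(mask, order, cum <= qhat, axis=1)
    return mask

def evaluate(logits_cal, y_cal, logits_test, y_test, T, alpha=0.1):
    """Calibrate the threshold at temperature T; return (coverage, mean set size)."""
    qhat = conformal_quantile(aps_scores(softmax_t(logits_cal, T), y_cal), alpha)
    sets = aps_sets(softmax_t(logits_test, T), qhat)
    coverage = sets[np.arange(len(y_test)), y_test].mean()
    return coverage, sets.sum(axis=1).mean()

# Synthetic, deliberately overconfident "classifier" (an assumption for illustration):
# labels are drawn from softmax(logits / 2), so T = 2 would calibrate it.
rng = np.random.default_rng(0)
n, K = 4000, 10
logits = 3.0 * rng.normal(size=(n, K))
true_p = softmax_t(logits, T=2.0)
u = rng.random(n)
labels = np.minimum((u[:, None] > np.cumsum(true_p, axis=1)).sum(axis=1), K - 1)

for T in (1.0, 2.0):
    cov, size = evaluate(logits[:2000], labels[:2000], logits[2000:], labels[2000:], T)
    print(f"T={T}: marginal coverage={cov:.3f}, mean set size={size:.2f}")
```

Because the threshold is recomputed for each temperature, the marginal coverage target of 1 - alpha holds in either case; what can change with T is the size of the sets and how coverage is distributed across classes, which is the trade-off the abstract describes.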

