This paper addresses selective classification for deep neural networks, in which a model is allowed to abstain from low-confidence predictions to avoid potential errors. We focus on post-hoc methods, which replace the confidence estimator of a given classifier without modifying or retraining it, which makes them appealing in practice. For neural networks with softmax outputs, our goal is to identify the best confidence estimator that can be computed directly from the unnormalized logits. This problem is motivated by the intriguing observation in recent work that many classifiers appear to have a "broken" confidence estimator, in the sense that their selective classification performance is much worse than their accuracy would lead one to expect. We perform an extensive experimental study of many existing and proposed confidence estimators applied to 84 pretrained ImageNet classifiers available from popular repositories. Our results show that a simple $p$-norm normalization of the logits, followed by taking the maximum logit as the confidence estimator, can lead to considerable gains in selective classification performance, completely fixing the pathological behavior observed in many classifiers. As a consequence, the selective classification performance of any classifier becomes almost entirely determined by its accuracy. Moreover, these results are shown to be consistent under distribution shift. Our code is available at https://github.com/lfpc/FixSelectiveClassification.
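The estimator described in the abstract, $p$-norm normalization of the logits followed by the maximum logit, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). The function names `max_logit_pnorm` and `selective_risk`, the default `p=4.0`, the optional mean-centering step, and the small `eps` guard are all assumptions made for illustration; in practice $p$ would presumably be tuned on held-out data.

```python
import numpy as np


def max_logit_pnorm(logits, p=4.0, center=True, eps=1e-12):
    """Confidence = max logit after p-norm normalization (illustrative sketch).

    logits: array of shape (n_samples, n_classes), unnormalized.
    p:      order of the norm (assumed default; not taken from the abstract).
    center: subtract the per-sample mean first (an assumption of this sketch).
    """
    z = np.asarray(logits, dtype=np.float64)
    if center:
        z = z - z.mean(axis=-1, keepdims=True)
    # Per-sample p-norm: (sum_k |z_k|^p)^(1/p)
    norm = np.linalg.norm(z, ord=p, axis=-1, keepdims=True)
    return (z / (norm + eps)).max(axis=-1)


def selective_risk(confidence, correct, coverage):
    """Error rate on the `coverage` fraction of samples with highest confidence."""
    confidence = np.asarray(confidence, dtype=np.float64)
    correct = np.asarray(correct, dtype=bool)
    n_keep = max(1, int(np.ceil(coverage * len(confidence))))
    # Keep the most confident samples; the rest are abstained on.
    keep = np.argsort(-confidence, kind="stable")[:n_keep]
    return 1.0 - correct[keep].mean()
```

Because the logits are divided by their own norm, the resulting confidence is invariant to a positive rescaling of each logit vector, which is the property that distinguishes it from the raw maximum logit. Sweeping `coverage` from 0 to 1 in `selective_risk` traces out a risk-coverage curve for comparing confidence estimators on the same classifier.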