This paper addresses selective classification for deep neural networks, in which a model may abstain from low-confidence predictions to avoid potential errors. We focus on so-called post-hoc methods, which replace the confidence estimator of a given classifier without modifying or retraining it, making them practically appealing. Considering neural networks with softmax outputs, our goal is to identify the best confidence estimator that can be computed directly from the unnormalized logits. This problem is motivated by the intriguing observation in recent work that many classifiers appear to have a "broken" confidence estimator, in the sense that their selective classification performance is much worse than what would be expected from their accuracies. We perform an extensive experimental study of many existing and proposed confidence estimators applied to 84 pretrained ImageNet classifiers available from popular repositories. Our results show that a simple $p$-norm normalization of the logits, followed by taking the maximum logit as the confidence estimator, can lead to considerable gains in selective classification performance, completely fixing the pathological behavior observed in many classifiers. As a consequence, the selective classification performance of any classifier becomes almost entirely determined by its accuracy. Moreover, these results are shown to be consistent under distribution shift.
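The estimator described in the abstract, $p$-norm normalization of the logits followed by taking the maximum logit, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the default `p=4.0`, the small `eps` guard, and the abstention threshold are all assumptions made for illustration, and any preprocessing the paper applies beyond what the abstract states is omitted. In practice, $p$ and the threshold would presumably be chosen on held-out data.

```python
import numpy as np


def max_logit_pnorm(logits, p=4.0, eps=1e-12):
    """Confidence score: maximum logit after dividing each row by its p-norm.

    `p` and `eps` are illustrative defaults, not values taken from the paper.
    """
    logits = np.asarray(logits, dtype=float)
    # p-norm of each logit vector, kept as a column so it broadcasts over classes.
    norms = np.linalg.norm(logits, ord=p, axis=-1, keepdims=True)
    return (logits / (norms + eps)).max(axis=-1)


def selective_predict(logits, threshold, p=4.0):
    """Return class predictions and a boolean mask of accepted samples.

    The classifier itself is untouched: predictions are still the argmax of
    the logits; only the confidence used to accept or abstain is replaced.
    """
    logits = np.asarray(logits, dtype=float)
    predictions = logits.argmax(axis=-1)
    accepted = max_logit_pnorm(logits, p=p) >= threshold
    return predictions, accepted


def coverage_and_selective_risk(predictions, accepted, labels):
    """Coverage (fraction accepted) and error rate on the accepted samples."""
    labels = np.asarray(labels)
    coverage = float(accepted.mean())
    if not accepted.any():
        return coverage, float("nan")  # risk is undefined when everything is rejected
    risk = float((predictions[accepted] != labels[accepted]).mean())
    return coverage, risk


if __name__ == "__main__":
    # Toy logits: one peaked row, one completely flat row.
    toy_logits = np.array([[2.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
    toy_labels = np.array([0, 1])
    preds, keep = selective_predict(toy_logits, threshold=0.8, p=2.0)
    print(coverage_and_selective_risk(preds, keep, toy_labels))
```

One property worth noting is that the normalized score is unchanged when a logit vector is multiplied by a positive constant, so it depends on the direction of the logits rather than their overall magnitude.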