Detecting out-of-distribution (OOD) instances is crucial for the reliable deployment of machine learning models in real-world scenarios. OOD inputs are commonly expected to yield more uncertain predictions on the primary task; however, in some OOD cases the model returns a highly confident prediction. This phenomenon, known as "overconfidence", poses a challenge to OOD detection. In particular, theoretical evidence indicates that overconfidence is an intrinsic property of certain neural network architectures and leads to poor OOD detection. In this work, we address this issue by measuring extreme activation values in the penultimate layer of neural networks and then leveraging this proxy of overconfidence to improve several OOD detection baselines. We test our method on a wide array of experiments spanning synthetic and real-world data, tabular and image datasets, multiple architectures such as ResNet and Transformer, different training loss functions, and the scenarios examined in previous theoretical work. Compared with the baselines, our method often yields substantial improvements, with double-digit increases in OOD detection AUC, and it does not degrade performance in any scenario.
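To make the idea concrete, the sketch below illustrates one way an overconfidence proxy based on extreme penultimate-layer activations could be computed and combined with a standard baseline OOD score (maximum softmax probability). It is a minimal illustration only: the `model.penultimate` and `model.head` accessors, the additive combination rule, and the weight `alpha` are hypothetical choices for exposition, not the exact procedure proposed in the paper.

```python
import torch
import torch.nn.functional as F

def ood_scores(model, x):
    """Compute a baseline OOD score (max softmax probability) and an
    overconfidence proxy (largest-magnitude activation in the penultimate
    layer) for a batch of inputs.

    Assumes `model.penultimate(x)` returns the features feeding the final
    linear layer and `model.head` maps them to logits; both names are
    hypothetical and depend on the actual architecture.
    """
    with torch.no_grad():
        feats = model.penultimate(x)            # (batch, d) penultimate activations
        logits = model.head(feats)              # (batch, num_classes) class logits

        # Baseline score: maximum softmax probability (higher = more in-distribution).
        msp = F.softmax(logits, dim=-1).max(dim=-1).values

        # Overconfidence proxy: extreme (largest-magnitude) activation per sample.
        extreme_act = feats.abs().max(dim=-1).values

        # One simple way to combine the two: penalise confident predictions that
        # co-occur with extreme activations. The additive form and alpha are
        # illustrative assumptions, not taken from the paper.
        alpha = 1.0
        combined = msp - alpha * extreme_act
    return msp, extreme_act, combined
```

In this illustrative scoring scheme, an OOD input that triggers unusually large penultimate activations is pushed toward a lower combined score even when its softmax confidence is high, which is the intuition behind using extreme activations as an overconfidence proxy.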