It is important to quantify the uncertainty of input samples, especially in mission-critical domains such as autonomous driving and healthcare, where incorrect predictions on out-of-distribution (OOD) data can have serious consequences. The OOD detection problem fundamentally arises because a model cannot express what it is not aware of. Post-hoc OOD detection approaches are widely explored because they require no additional re-training, which might degrade the model's performance and increase the training cost. In this study, from the perspective of neurons in the deep layers of the model, which represent high-level features, we introduce a new aspect for analyzing the difference in model outputs between in-distribution and OOD data. We propose a novel method, Leveraging Important Neurons (LINe), for post-hoc out-of-distribution detection. Shapley value-based pruning reduces the effects of noisy outputs by selecting only the neurons that contribute most to predicting the specific class of the input and masking the rest. Activation clipping sets all values above a certain threshold to the same value, allowing LINe to treat all class-specific features equally and to consider only the difference in the number of activated features between in-distribution and OOD data. Comprehensive experiments verify the effectiveness of the proposed method, which outperforms state-of-the-art post-hoc OOD detection methods on the CIFAR-10, CIFAR-100, and ImageNet datasets.
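The two components described in the abstract, class-specific pruning and activation clipping, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how the Shapley values are estimated, so `contrib` is a hypothetical, precomputed matrix of per-class neuron contribution scores standing in for them. The top-`k` selection rule, choosing the mask by the predicted class, applying both steps to the features that feed a final linear layer, and the log-sum-exp score are likewise assumptions made for illustration.

```python
import numpy as np

def topk_class_masks(contrib, k):
    """Build 0/1 masks that keep the k highest-scoring neurons per class.

    contrib: (num_classes, num_neurons) array of contribution scores
             (hypothetical stand-in for Shapley values).
    """
    masks = np.zeros_like(contrib, dtype=float)
    top = np.argsort(contrib, axis=1)[:, -k:]   # indices of the k largest scores
    np.put_along_axis(masks, top, 1.0, axis=1)
    return masks

def clip_and_prune(features, masks, weights, bias, threshold):
    """Clip activations, mask low-contribution neurons, and recompute logits.

    features: (n, num_neurons) activations fed to the final linear layer
    weights:  (num_classes, num_neurons), bias: (num_classes,)
    """
    # Assumption: the mask is chosen by the class predicted from raw features.
    pred = (features @ weights.T + bias).argmax(axis=1)
    clipped = np.minimum(features, threshold)   # values above threshold become equal
    pruned = clipped * masks[pred]              # zero out the remaining neurons
    return pruned @ weights.T + bias

def logsumexp_score(logits):
    """Numerically stable log-sum-exp over classes (assumed scoring function)."""
    m = logits.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()

# Toy example: 2 classes, 3 neurons, one input sample.
contrib = np.array([[0.9, 0.1, 0.5],
                    [0.2, 0.8, 0.7]])
masks = topk_class_masks(contrib, k=2)
features = np.array([[5.0, 3.0, 0.5]])
weights = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0]])
bias = np.zeros(2)
logits = clip_and_prune(features, masks, weights, bias, threshold=1.0)
score = logsumexp_score(logits)
```

In the toy example, clipping reduces the two large activations to the same value, so after masking the logits depend mainly on how many of the selected neurons are active rather than on how strongly they fire, which is the behavior the abstract attributes to activation clipping.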