Deep neural networks tend to make overconfident predictions, so safety-critical applications in particular often require additional detectors for misclassifications. Existing detection methods usually consider only adversarial attacks or out-of-distribution samples as causes of false predictions. However, generalization errors arise for diverse reasons, often related to relevant invariances being learned poorly. We therefore propose GIT, a holistic approach for detecting generalization errors that combines gradient information with invariance transformations. The invariance transformations are designed to shift misclassified samples back into the generalization area of the neural network, while the gradient information, computed on the transformed sample, measures the contradiction between the initial prediction and the network's corresponding inherent computations. Our experiments demonstrate that GIT outperforms the state of the art across a variety of network architectures, problem setups, and perturbation types.
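To make the interplay of the two components concrete, the following is a minimal sketch of the idea and not the GIT implementation itself. It assumes a toy linear softmax classifier, a moving-average filter as the invariance transformation, and the L1 norm of the parameter gradient as the score; the abstract fixes none of these choices, and all function names (`mean_filter`, `gradient_score`) are hypothetical. The score is the gradient norm of the cross-entropy loss evaluated on the transformed sample, with the initial prediction on the untransformed sample used as the target.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, bias, x):
    """Class probabilities of a linear softmax classifier (toy stand-in for a DNN)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


def mean_filter(x, radius=1):
    """Assumed invariance transformation: moving-average smoothing of a 1-D signal."""
    n = len(x)
    out = []
    for i in range(n):
        window = x[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def gradient_score(weights, bias, x, transform):
    """Return (initial_label, score).

    The score is the L1 norm of the cross-entropy gradient w.r.t. all
    parameters, evaluated on transform(x) with the initial prediction on x
    as the target. A large value means the transformed sample contradicts
    the initial prediction.
    """
    probs = predict_proba(weights, bias, x)
    initial_label = max(range(len(probs)), key=probs.__getitem__)

    x_t = transform(x)
    probs_t = predict_proba(weights, bias, x_t)

    # Gradient of the cross-entropy loss w.r.t. the logits: p - onehot(label).
    delta = [p - (1.0 if k == initial_label else 0.0)
             for k, p in enumerate(probs_t)]

    # dL/dW[k][j] = delta[k] * x_t[j], so the L1 norm factorizes;
    # dL/db[k] = delta[k].
    delta_l1 = sum(abs(d) for d in delta)
    grad_w_l1 = delta_l1 * sum(abs(v) for v in x_t)
    return initial_label, grad_w_l1 + delta_l1


# Toy model: the class depends only on the sign of the center element.
W = [[0.0, 0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, -2.0, 0.0, 0.0]]
b = [0.0, 0.0]

clean = [1.0, 1.0, 1.0, 1.0, 1.0]   # true class 0
noisy = [1.0, 1.0, -1.0, 1.0, 1.0]  # same signal with one impulse-noise pixel

clean_label, clean_score = gradient_score(W, b, clean, mean_filter)
noisy_label, noisy_score = gradient_score(W, b, noisy, mean_filter)
print(clean_label, clean_score)
print(noisy_label, noisy_score)
```

In this toy setting the impulse flips the initial prediction, the smoothing moves the sample back toward the region where the model predicts the original class, and the resulting gradient norm is larger than for the clean sample. Thresholding such a score would give a simple detector; how GIT selects transformations, layers, and the gradient statistic is described in the paper itself rather than here.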