Understanding Parameter Saliency via Extreme Value Theory

Deep neural networks have been increasingly deployed throughout society in recent years. When diagnosing undesirable model behavior, it is useful to identify which parameters trigger misclassification. The concept of parameter saliency has been proposed and used to diagnose convolutional neural networks (CNNs) by ranking the convolution filters that may have caused a misclassification. It has also been shown that fine-tuning the top-ranking salient filters efficiently corrects misidentification on ImageNet. However, there is still a knowledge gap in understanding why parameter saliency ranking can find the filters that induce misidentification. In this work, we attempt to bridge this gap by analyzing parameter saliency ranking from a statistical viewpoint, namely, extreme value theory. We first show that the existing work implicitly assumes that the gradient norm computed for each filter follows a normal distribution. Then, we clarify the relationship between parameter saliency and the score based on the peaks-over-threshold (POT) method, which is often used to model extreme values. Finally, we reformulate parameter saliency in terms of the POT method; this reformulation can be regarded as statistical anomaly detection and does not require the implicit assumptions of the existing parameter-saliency formulation. Our experimental results demonstrate that our reformulation can also detect malicious filters. Furthermore, we show that the existing parameter saliency method exhibits a bias with respect to the depth of layers in deep neural networks. In particular, this bias can inhibit the discovery of filters that cause misidentification when domain shift occurs. In contrast, parameter saliency based on POT shows less of this bias.
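
The contrast between the two scoring views mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: the function names, the 0.9 threshold quantile, the method-of-moments fit of the generalized Pareto distribution, and the synthetic exponential "gradient norms" are all hypothetical choices made for the example. The first function standardizes a filter's gradient norm against reference values, which is most natural when those values are roughly Gaussian. The second estimates how improbable the same value is from the tail alone, using peaks-over-threshold.

```python
import math
import random
import statistics


def zscore_saliency(reference, value):
    """Standardize a gradient norm against reference norms (mean/std view)."""
    mu = statistics.fmean(reference)
    sd = statistics.pstdev(reference)
    return (value - mu) / sd


def fit_gpd_moments(excesses):
    """Method-of-moments fit of a generalized Pareto distribution.

    Uses mean = sigma / (1 - xi) and var = mean^2 / (1 - 2 * xi),
    which is valid for shape xi < 1/2. Returns (xi, sigma).
    """
    m = statistics.fmean(excesses)
    v = statistics.variance(excesses)
    ratio = m * m / v
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (1.0 + ratio)
    return xi, sigma


def pot_tail_probability(reference, value, quantile=0.9):
    """Estimate P(X > value) with a peaks-over-threshold tail model."""
    ordered = sorted(reference)
    u = ordered[int(quantile * len(ordered))]  # threshold (assumed quantile)
    if value <= u:
        # Below the threshold, fall back to the empirical survival function.
        return sum(x > value for x in ordered) / len(ordered)
    excesses = [x - u for x in ordered if x > u]
    xi, sigma = fit_gpd_moments(excesses)
    rate = len(excesses) / len(ordered)  # empirical P(X > u)
    z = (value - u) / sigma
    if abs(xi) < 1e-9:
        survival = math.exp(-z)  # exponential limit of the GPD
    else:
        base = 1.0 + xi * z
        # base <= 0 means value lies beyond the fitted upper endpoint.
        survival = base ** (-1.0 / xi) if base > 0 else 0.0
    return rate * survival


if __name__ == "__main__":
    random.seed(0)
    # Synthetic, skewed stand-in for one filter's gradient norms.
    norms = [random.expovariate(1.0) for _ in range(2000)]
    for candidate in (0.5, 4.0, 6.0):
        z = zscore_saliency(norms, candidate)
        p = pot_tail_probability(norms, candidate)
        print(f"value={candidate:.1f}  z-score={z:.2f}  POT tail prob={p:.4g}")
```

A smaller tail probability marks a more anomalous gradient norm, so a score such as the negative log of that probability could be used for ranking. Whether this matches the exact score used in the paper is not established by the abstract.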
