Explainability is a key component in many applications involving deep neural networks (DNNs). However, current explanation methods for DNNs commonly leave it to the human observer to distinguish relevant explanations from spurious noise. This is no longer feasible when moving from data that humans can readily interpret, such as images, to more complex data such as genome sequences. To make DNN outputs on such complex data more accessible and to increase explainability, we present a modification of the widely used explanation method layer-wise relevance propagation (LRP). Our approach enforces sparsity directly by pruning the relevance propagation at the individual layers. We thereby achieve sparser relevance attributions for the input features as well as for the intermediate layers. Because relevance propagation is input-specific, we prune the relevance propagation rather than the underlying model architecture. This allows different neurons to be pruned for different inputs and hence might be better suited to the local nature of explanation methods. To demonstrate the efficacy of our method, we evaluate it on two types of data: images and genomic sequences. We show that, compared to the baseline, our modification reduces noise and concentrates relevance on the most important features.
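The idea of pruning the relevance propagation, rather than the model, can be illustrated with a minimal sketch. The abstract does not state the pruning criterion, so everything specific here is an assumption made for illustration: the LRP-epsilon rule for dense layers, a small ReLU network given as a list of `(W, b)` pairs, and a top-`keep`-by-magnitude mask applied to the relevance at every layer. The function names `lrp_epsilon_dense`, `prune_relevance`, and `pruned_lrp` are hypothetical and not taken from the paper.

```python
import numpy as np

def lrp_epsilon_dense(a, W, b, R_out, eps=1e-6):
    """Propagate relevance R_out back through z = a @ W + b (LRP-epsilon rule)."""
    z = a @ W + b
    # Stabilise the denominator so that near-zero pre-activations do not blow up.
    z_stab = z + eps * np.where(z >= 0, 1.0, -1.0)
    s = R_out / z_stab
    return a * (W @ s)

def prune_relevance(R, keep):
    """Assumed criterion: keep the `keep` largest-magnitude entries, zero the rest."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    if keep >= R.size:
        return R.copy()
    idx = np.argsort(np.abs(R))[-keep:]
    pruned = np.zeros_like(R)
    pruned[idx] = R[idx]
    return pruned

def pruned_lrp(x, layers, keep):
    """Input relevance for the arg-max output of a ReLU MLP, pruned at every layer.

    layers: list of (W, b) with W of shape (n_in, n_out); the last layer is linear.
    """
    activations = [x]
    for i, (W, b) in enumerate(layers):
        z = activations[-1] @ W + b
        is_last = i == len(layers) - 1
        activations.append(z if is_last else np.maximum(z, 0.0))
    out = activations[-1]
    # Start the backward pass from the winning output only.
    R = np.zeros_like(out)
    R[np.argmax(out)] = out.max()
    for (W, b), a in zip(reversed(layers), reversed(activations[:-1])):
        R = prune_relevance(lrp_epsilon_dense(a, W, b, R), keep)
    return R

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [(rng.normal(size=(8, 6)), np.zeros(6)),
              (rng.normal(size=(6, 3)), np.zeros(3))]
    x = rng.normal(size=8)
    print(pruned_lrp(x, layers, keep=3))
```

Because the mask is computed from the relevance of one specific input, a different input can retain a different set of neurons while the network weights stay untouched. Note that plain masking as sketched here discards the pruned relevance, so the total relevance is not conserved across layers.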