The explanation of neural network models has recently attracted considerable research attention. In computer vision, CAM (Class Activation Map)-based methods and LRP (Layer-wise Relevance Propagation) are two common families of explanation methods. However, because most CAM-based methods can only generate global weights, they produce only coarse-grained explanations at deep layers. LRP and its variants, on the other hand, can generate fine-grained explanations, but the faithfulness of those explanations is too low. To address these challenges, we propose FG-CAM (Fine-Grained CAM), which extends CAM-based methods to generate explanations that are both fine-grained and highly faithful. FG-CAM uses the relationship between two adjacent layers of feature maps that differ in resolution to gradually increase the explanation resolution, while identifying the contributing pixels and filtering out those that do not contribute. Our method not only addresses this shortcoming of CAM-based methods without changing their characteristics, but also generates fine-grained explanations with higher faithfulness than LRP and its variants. We also present FG-CAM with denoising, a variant of FG-CAM that generates less noisy explanations with almost no change in explanation faithfulness. Experimental results show that the performance of FG-CAM is almost unaffected by the explanation resolution. FG-CAM significantly outperforms existing CAM-based methods at both shallow and intermediate layers, and significantly outperforms LRP and its variants at the input layer. Our code is available at https://github.com/dongmo-qcq/FG-CAM.
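To make the coarse-resolution limitation concrete, the following is a minimal NumPy sketch of a generic Grad-CAM-style computation with one global weight per channel. It is not FG-CAM and is not taken from the authors' repository; the function name `cam_from_global_weights`, the array shapes, and the random inputs are assumptions chosen purely for illustration.

```python
import numpy as np

def cam_from_global_weights(feature_maps, gradients):
    """Generic Grad-CAM-style map (NOT FG-CAM): one scalar weight per channel.

    feature_maps, gradients: arrays of shape (channels, height, width)
    from the same layer. Both are hypothetical inputs for illustration.
    """
    # Global weight per channel: spatial average of the gradients.
    weights = gradients.mean(axis=(1, 2))               # shape (channels,)
    # Weighted sum over channels, then ReLU to keep positive evidence.
    cam = np.tensordot(weights, feature_maps, axes=1)   # shape (height, width)
    return np.maximum(cam, 0.0)

# Assumed example: a deep layer with 512 channels at 7x7 resolution.
rng = np.random.default_rng(0)
deep_maps = rng.normal(size=(512, 7, 7))
deep_grads = rng.normal(size=(512, 7, 7))
deep_cam = cam_from_global_weights(deep_maps, deep_grads)
print(deep_cam.shape)  # (7, 7)

# Nearest-neighbour upsampling to an assumed 224x224 input only repeats cells,
# so each explanation value still covers a 32x32 block of pixels.
upsampled = np.kron(deep_cam, np.ones((32, 32)))
print(upsampled.shape)  # (224, 224)
```

Because each channel receives a single scalar weight, the resulting map can never be finer than the feature map it is computed from, which is why such explanations are coarse at deep layers.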