Decisions made by convolutional neural networks (CNNs) can be understood and explained by visualizing the discriminative regions of an image. To this end, class activation map (CAM)-based methods were proposed as powerful interpretation tools that make the predictions of deep learning models more explainable, transparent, and trustworthy. However, existing CAM-based methods (e.g., CAM, Grad-CAM, and Relevance-CAM) can only interpret CNN models that use fully-connected (FC) layers as the classifier. Many deep learning models classify images without FC layers, for example in few-shot image classification, contrastive learning image classification, and image retrieval. In this work, we propose a post-hoc interpretation tool named feature activation map (FAM), which can interpret deep learning models that do not use FC layers as a classifier. In the proposed FAM algorithm, the channel-wise contribution weights are derived from the similarity score between two image embeddings. The activation maps are then linearly combined using the corresponding normalized contribution weights to form the explanation map for visualization. Quantitative and qualitative experiments on ten deep learning models for few-shot image classification, contrastive learning image classification, and image retrieval demonstrate the effectiveness of the proposed FAM algorithm.
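The two steps described above (channel-wise weights derived from a similarity score, then a weighted linear combination of activation maps) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the embedding is taken to be the global average pool of the last convolutional activation maps, the similarity is cosine similarity, each channel's weight is its additive term in that cosine similarity, weights are normalized by their maximum absolute value, and a ReLU is applied to the result as in Grad-CAM. The function name `feature_activation_map` and its parameters are hypothetical.

```python
import numpy as np


def feature_activation_map(query_maps, reference_embedding, eps=1e-8):
    """Sketch of a similarity-driven explanation map (assumptions in lead-in).

    query_maps:          array of shape (C, H, W), activation maps of the
                         image being explained.
    reference_embedding: array of shape (C,), embedding of the image it is
                         compared against (e.g. a support or gallery image).
    Returns (explanation_map of shape (H, W), weights of shape (C,)).
    """
    # Assumption: the embedding is the global average pool of the maps.
    query_embedding = query_maps.mean(axis=(1, 2))

    # Cosine similarity is a sum over channels of q_k * r_k / (|q| |r|);
    # each summand is used here as that channel's contribution weight.
    denom = (np.linalg.norm(query_embedding)
             * np.linalg.norm(reference_embedding) + eps)
    weights = query_embedding * reference_embedding / denom

    # Assumption: normalize weights by their largest magnitude.
    weights = weights / (np.abs(weights).max() + eps)

    # Linear combination of activation maps over the channel axis.
    explanation = np.tensordot(weights, query_maps, axes=([0], [0]))

    # Keep positive evidence only and rescale to [0, 1] for display.
    explanation = np.maximum(explanation, 0.0)
    explanation = explanation / (explanation.max() + eps)
    return explanation, weights
```

In practice the returned map would be upsampled to the input resolution and overlaid on the image; that step is omitted here.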