Saliency maps have become one of the most widely used interpretability techniques for convolutional neural networks (CNNs), owing to their simplicity and the quality of the insights they provide. However, it remains unclear whether these insights faithfully represent what CNNs rely on to make their predictions. This paper explores how retaining the sign of the gradients, which saliency maps usually discard, can lead to a deeper understanding of multi-class classification problems. Using both pretrained CNNs and CNNs trained from scratch, we show that considering the sign and the effect not only of the correct class but also of the other classes makes it possible to better identify the pixels of the image that the network is really focusing on. Furthermore, it becomes clearer how occluding or altering those pixels is expected to affect the outcome.
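To make the idea of a signed, multi-class saliency map concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a hypothetical linear softmax classifier in place of a CNN, so that the gradient of a class probability with respect to each pixel has a closed form. The names `predict_proba`, `signed_saliency`, and `split_by_sign`, as well as the toy weights, are illustrative assumptions.

```python
import math


def predict_proba(W, b, x):
    """Softmax probabilities of a toy linear classifier: logits = W x + b."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def signed_saliency(W, b, x, target):
    """Gradient of the target-class probability with respect to each pixel.

    For a linear softmax model, d p_c / d x_j = p_c * (W[c][j] - sum_k p_k W[k][j]),
    so the weights of the other classes enter every entry of the map.
    """
    p = predict_proba(W, b, x)
    n_pixels = len(x)
    # Probability-weighted average weight per pixel, taken over all classes.
    mean_w = [sum(p[k] * W[k][j] for k in range(len(W))) for j in range(n_pixels)]
    return [p[target] * (W[target][j] - mean_w[j]) for j in range(n_pixels)]


def split_by_sign(grad):
    """Separate a signed map into positive and negative parts (both non-negative)."""
    positive = [max(g, 0.0) for g in grad]
    negative = [max(-g, 0.0) for g in grad]
    return positive, negative


# Toy example (assumed values): 3 classes, an "image" of 4 pixels.
W = [
    [0.5, -1.0, 0.0, 2.0],
    [-0.5, 1.0, 0.3, -1.0],
    [0.0, 0.0, -0.3, -1.0],
]
b = [0.0, 0.0, 0.0]
x = [1.0, 0.5, 0.2, 0.8]

# One signed map per class, plus the sign-free map for class 0.
maps = [signed_saliency(W, b, x, c) for c in range(len(W))]
unsigned_map = [abs(g) for g in maps[0]]
positive, negative = split_by_sign(maps[0])

print("signed map, class 0:  ", maps[0])
print("unsigned map, class 0:", unsigned_map)
print("pixels that raise p_0:", [j for j, g in enumerate(maps[0]) if g > 0])
print("pixels that lower p_0:", [j for j, g in enumerate(maps[0]) if g < 0])
```

In this sketch the unsigned map only says which pixels matter, whereas the signed map also says whether increasing a pixel raises or lowers the probability of the class. Because the class probabilities sum to one, the per-pixel gradients of all classes sum to zero, so a pixel that raises one class necessarily lowers others. For an actual CNN the gradient would instead be obtained by automatic differentiation.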