Convolutional Neural Networks (CNNs) are currently the model of choice in Computer Vision, thanks to their ability to automate feature extraction in visual tasks. However, the knowledge they acquire during training is fully subsymbolic, and hence difficult to understand and to explain to end users. In this paper, we propose a new technique called HOLMES (HOLonym-MEronym based Semantic inspection) that decomposes a label into a set of related concepts and provides component-level explanations for an image classification model. Specifically, HOLMES leverages ontologies, web scraping and transfer learning to automatically construct meronym (part)-based detectors for a given holonym (class). It then produces heatmaps at the meronym level and finally, by probing the holonym CNN with occluded images, highlights the importance of each part to the classification output. Compared to state-of-the-art saliency methods, HOLMES goes a step further and provides information about both where and what the holonym CNN is looking at, without relying on densely annotated datasets and without forcing concepts to be associated with single computational units. Extensive experimental evaluation on different categories of objects (animals, tools and vehicles) shows the feasibility of our approach. On average, HOLMES explanations include at least two meronyms, and the ablation of a single meronym roughly halves the holonym model's confidence. The resulting heatmaps were quantitatively evaluated using deletion/insertion/preservation curves. All metrics were comparable to those achieved by GradCAM, while HOLMES offers the additional advantage of decomposing the heatmap into human-understandable concepts, thus highlighting both the relevance of meronyms to object classification and the ability of HOLMES to capture it. The code is available at https://github.com/FrancesC0de/HOLMES.
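The occlusion-based probing step can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `meronym_score_drops`, the boolean-mask representation of a meronym region, the constant fill value, and the toy confidence function standing in for the holonym CNN are all assumptions made for illustration. The actual masking and scoring used in HOLMES may differ.

```python
import numpy as np


def meronym_score_drops(image, masks, holonym_confidence, fill_value=0.0):
    """Return the relative confidence drop caused by occluding each meronym.

    image: H x W (or H x W x C) float array.
    masks: dict mapping a meronym name to a boolean H x W array
           (True marks pixels attributed to that part). Hypothetical format.
    holonym_confidence: callable mapping an image to a class confidence.
    """
    baseline = holonym_confidence(image)
    drops = {}
    for name, mask in masks.items():
        occluded = image.copy()
        occluded[mask] = fill_value  # blank out the pixels of this part
        score = holonym_confidence(occluded)
        # Relative drop: 0 means no effect, 1 means the confidence vanished.
        drops[name] = (baseline - score) / baseline if baseline > 0 else 0.0
    return drops


# Toy stand-in for the holonym CNN (assumption): mean pixel intensity.
def toy_confidence(img):
    return float(img.mean())


image = np.ones((4, 4))
masks = {
    "wheel": np.zeros((4, 4), dtype=bool),
    "door": np.zeros((4, 4), dtype=bool),
}
masks["wheel"][:2, :] = True  # covers half of the image
masks["door"][3, :] = True    # covers a quarter of the image

drops = meronym_score_drops(image, masks, toy_confidence)
```

With a real model, `holonym_confidence` would wrap a forward pass returning the softmax probability of the holonym class, and the masks would presumably come from thresholding the meronym-level heatmaps.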