Convolutional neural networks (CNNs) are increasingly used in critical systems, where robustness and alignment are crucial. In this context, the field of explainable artificial intelligence has proposed generating high-level explanations of the prediction process of CNNs through concept extraction. While these methods can detect whether a concept is present in an image, they cannot determine its location. Moreover, a fair comparison of such approaches is difficult because proper validation procedures are lacking. To address these issues, we propose a novel method for automatic concept extraction and localization based on representations obtained through pixel-wise aggregations of CNN activation maps. Further, we introduce a process for validating concept-extraction techniques based on synthetic datasets with pixel-wise annotations of their main components, reducing the need for human intervention. Extensive experimentation on both synthetic and real-world datasets demonstrates that our method outperforms state-of-the-art alternatives.