Explainability of intelligent models has received increasing attention in recent years. Among explainability approaches, concept-based techniques stand out because they explain predictions through a set of human-meaningful concepts rather than individual pixels. However, few methods consistently provide both local and global explanations, and most offer no way to explain misclassified cases. To address these challenges, our study follows a straightforward yet effective approach. We propose a unified concept-based system that feeds a number of super-pixelated images into the networks, allowing them to learn better representations of both the target objects and the target concepts. The method automatically learns, scores, and extracts local and global concepts. Our experiments show that, in addition to improving performance, the models provide deeper insight into their predictions and help explain false classifications.
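To make the input-preparation idea concrete, the following is a minimal sketch of how several super-pixelated copies of one image might be produced. It rests on assumptions not stated in the abstract: the segmentation algorithm is not named here, so a regular grid with per-cell mean intensity stands in for a real superpixel method, and the function names `superpixelate` and `multi_scale_inputs`, the `cells` parameter, and the use of several granularities are all hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def superpixelate(image: Image, cell: int) -> Image:
    """Replace every cell x cell block with its mean intensity.

    Assumption: a regular grid is a crude stand-in for a real superpixel
    algorithm; the abstract does not name the segmentation method.
    """
    if cell < 1:
        raise ValueError("cell must be >= 1")
    height = len(image)
    width = len(image[0]) if height else 0
    out = [[0.0] * width for _ in range(height)]
    for top in range(0, height, cell):
        for left in range(0, width, cell):
            # Clip the block at the image border.
            rows = range(top, min(top + cell, height))
            cols = range(left, min(left + cell, width))
            block = [image[r][c] for r in rows for c in cols]
            mean = sum(block) / len(block)
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


def multi_scale_inputs(image: Image, cells: List[int]) -> List[Image]:
    """Build one super-pixelated copy of the image per granularity.

    Assumption: feeding several granularities is illustrative only.
    """
    return [superpixelate(image, cell) for cell in cells]


image = [
    [0.0, 2.0, 4.0, 6.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 1.0, 5.0, 5.0],
]
variants = multi_scale_inputs(image, [1, 2, 4])
```

With `cell=1` the image is returned unchanged, while larger cells yield progressively coarser regions. In practice a content-aware superpixel algorithm would likely replace the grid, since its regions follow object boundaries rather than fixed squares.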