Counterfactual examples have emerged as an effective approach for producing simple and understandable post-hoc explanations. In the context of graph classification, previous work has focused on generating counterfactual explanations by manipulating the most elementary units of a graph, i.e., removing an existing edge or adding a non-existent one. In this paper, we argue that such a language of explanation might be too fine-grained, and turn our attention to some of the main characterizing features of real-world complex networks, such as the tendency to close triangles, the existence of recurring motifs, and the organization into dense modules. We thus define a general density-based counterfactual search framework to generate instance-level counterfactual explanations for graph classifiers, which can be instantiated with different notions of dense substructures. In particular, we show two specific instantiations of this general framework: a method that searches for counterfactual graphs by opening or closing triangles, and a method driven by maximal cliques. We also discuss how the general method can be instantiated to exploit any other notion of dense substructures, including, for instance, a given taxonomy of nodes. We evaluate the effectiveness of our approaches on seven brain network datasets and compare the generated counterfactual statements according to several widely used metrics. Results confirm that adopting a semantically relevant unit of change like density is essential to define versatile and interpretable counterfactual explanation methods.
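To make the idea of a triangle-based unit of change concrete, the following is a minimal sketch of a counterfactual search that edits a graph only by closing or opening triangles. It is not the method of the paper, whose details are not given in this abstract: the breadth-first strategy, the function names (`closing_moves`, `opening_moves`, `triangle_counterfactual`), and the toy classifier `toy_classify` are all assumptions made for illustration.

```python
from itertools import combinations


def adjacency(edges):
    """Build a node -> set-of-neighbors map from an iterable of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def closing_moves(edges):
    """Non-edges whose endpoints share a neighbor: adding one closes a triangle."""
    adj = adjacency(edges)
    moves = set()
    for nbrs in adj.values():
        for u, v in combinations(sorted(nbrs), 2):
            if v not in adj[u]:
                moves.add((u, v))
    return moves


def opening_moves(edges):
    """Existing edges that lie on at least one triangle: removing one opens it."""
    adj = adjacency(edges)
    return {(u, v) for u, v in edges if adj[u] & adj[v]}


def count_triangles(edges):
    """Each triangle is seen once per edge, hence the division by 3."""
    adj = adjacency(edges)
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def toy_classify(edges):
    """Hypothetical stand-in for a graph classifier: 1 if the graph has >= 2 triangles."""
    return 1 if count_triangles(edges) >= 2 else 0


def triangle_counterfactual(edges, classify, max_steps=3):
    """Breadth-first search over triangle edits until the predicted label flips.

    Returns (counterfactual_edges, number_of_edits), or (None, None) if no
    counterfactual is found within max_steps. Intended for small graphs only,
    since the frontier grows quickly.
    """
    start = frozenset((min(u, v), max(u, v)) for u, v in edges)
    original_label = classify(start)
    frontier, seen = [start], {start}
    for step in range(1, max_steps + 1):
        next_frontier = []
        for g in frontier:
            candidates = [g | {m} for m in sorted(closing_moves(g))]
            candidates += [g - {m} for m in sorted(opening_moves(g))]
            for cand in candidates:
                if cand in seen:
                    continue
                seen.add(cand)
                if classify(cand) != original_label:
                    return cand, step
                next_frontier.append(cand)
        frontier = next_frontier
    return None, None


if __name__ == "__main__":
    # A triangle (0, 1, 2) with a pendant node 3 attached to node 2.
    graph = [(0, 1), (1, 2), (0, 2), (2, 3)]
    cf, steps = triangle_counterfactual(graph, toy_classify)
    print("edits:", steps, "changed edges:", sorted(cf ^ frozenset(graph)))
```

Restricting the move set to edges that close or open a triangle keeps every edit tied to a local change in density, which is the kind of coarser unit of change the abstract argues for. A clique-driven variant would presumably swap the two move generators for ones that add or remove whole cliques.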