Ensuring the trustworthiness of graph neural networks (GNNs), which are often treated as black-box models, requires effective explanation techniques. Existing GNN explanation methods typically apply input perturbations to identify subgraphs responsible for the model's final output. However, such approaches lack fine-grained, layer-wise analysis of how intermediate representations contribute to the final result, a capability that is crucial for model diagnosis and architecture optimization. This paper introduces SliceGX, a novel GNN explanation approach that generates explanations at specific GNN layers in a progressive manner. Given a GNN model M, a set of selected intermediate layers, and a target layer, SliceGX slices M into layer blocks ("model slices") and discovers high-quality explanatory subgraphs within each block that elucidate how the model output arises at the target layer. Although finding such layer-wise explanations is computationally challenging, we develop efficient algorithms and optimization techniques that incrementally construct and maintain these subgraphs with provable approximation guarantees. Extensive experiments on synthetic and real-world benchmarks demonstrate the effectiveness and efficiency of SliceGX, and illustrate its practical utility in supporting model debugging.