Evaluating the quality of post-hoc explanations for Graph Neural Networks (GNNs) remains a significant challenge. While explainability methods have proliferated in recent years, current evaluation metrics (e.g., fidelity, sparsity) often fail to assess whether an explanation identifies the true underlying causal variables. To address this, we propose the Explanation-Generalization Score (EGS), a metric that quantifies the causal relevance of GNN explanations. EGS is founded on the principle of feature invariance and posits that if an explanation captures true causal drivers, it should lead to stable predictions across distribution shifts. To quantify this, we introduce a framework that trains GNNs on explanatory subgraphs and evaluates their performance in Out-of-Distribution (OOD) settings (here, OOD generalization serves as a rigorous proxy for the explanation's causal validity). Through large-scale validation involving 11,200 model combinations across synthetic and real-world datasets, our results demonstrate that EGS provides a principled benchmark for ranking explainers by their ability to capture causal substructures, offering a robust alternative to traditional fidelity-based metrics.
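The evaluation loop behind EGS can be sketched in a few lines. The sketch below is a minimal illustration of the idea only, not the authors' released implementation: `egs_score`, `fit`, and `accuracy` are hypothetical stand-ins, and graphs are reduced to toy tuples of a causal motif and a spurious feature whose correlation with the label flips under the OOD shift.

```python
def egs_score(explainer, train_data, ood_data, fit, accuracy):
    """Explanation-Generalization Score (sketch): retrain a model from
    scratch on the explanatory subgraphs only, then report its accuracy
    on an out-of-distribution split as a proxy for causal validity."""
    # 1. Extract the explanatory subgraph for every training graph.
    subgraphs = [(explainer(g), y) for g, y in train_data]
    # 2. Fit a fresh model on the subgraphs alone.
    model = fit(subgraphs)
    # 3. OOD performance of that model is the EGS.
    return accuracy(model, [(explainer(g), y) for g, y in ood_data])


# Toy demonstration: each "graph" is (causal_motif, spurious_feature).
# In-distribution, the spurious feature correlates with the label;
# in the OOD split, that correlation is flipped.
train = [(("A", "x"), 1), (("B", "y"), 0)]
ood = [(("A", "y"), 1), (("B", "x"), 0)]

causal_explainer = lambda g: g[0]    # keeps the label-determining motif
spurious_explainer = lambda g: g[1]  # keeps the spurious feature

fit = lambda pairs: dict(pairs)      # memorize subgraph -> label
accuracy = lambda m, data: sum(m.get(s) == y for s, y in data) / len(data)

print(egs_score(causal_explainer, train, ood, fit, accuracy))    # 1.0
print(egs_score(spurious_explainer, train, ood, fit, accuracy))  # 0.0
```

As the toy run illustrates, an explainer that selects the causal substructure scores 1.0 while one that selects the spurious feature scores 0.0, so EGS ranks the two explainers exactly as the abstract claims.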