Graph Neural Networks (GNNs) have become a building block of graph data processing, with wide applications in critical domains. The growing need to deploy GNNs in high-stakes applications makes explainability essential for users in decision-making processes. A popular paradigm for explaining GNNs is to identify explainable subgraphs by comparing their labels with those of the original graphs. This task is challenging because of the substantial distribution shift from the original graphs in the training set to the set of explainable subgraphs, which prevents accurate prediction of labels from the subgraphs. To address this, we propose a novel method that generates proxy graphs for explainable subgraphs that lie within the distribution of the training data. We introduce a parametric method that employs graph generators to produce the proxy graphs. A new training objective based on information theory is designed to ensure that proxy graphs not only adhere to the distribution of the training data but also preserve explanatory factors. The generated proxy graphs can then be reliably used to approximate the label predictions of explainable subgraphs. Empirical evaluations across various datasets demonstrate that our method achieves more accurate explanations for GNNs.
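To make the label-comparison paradigm concrete, the following is a minimal sketch and not the method proposed here. It assumes a hypothetical `toy_model` that stands in for a trained GNN and predicts label 1 whenever the graph contains a triangle; the helper `label_preserved` and the example graphs are likewise illustrative names introduced only for this sketch.

```python
from itertools import combinations


def toy_model(edges):
    """Hypothetical stand-in for a trained GNN (an assumption for this sketch).

    Predicts 1 if the undirected graph contains a triangle, otherwise 0.
    """
    edge_set = {frozenset(e) for e in edges}
    nodes = sorted({n for e in edge_set for n in e})
    for a, b, c in combinations(nodes, 3):
        triangle = {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))}
        if triangle <= edge_set:
            return 1
    return 0


def label_preserved(model, graph_edges, subgraph_edges):
    """Return True if the model assigns the subgraph the same label as the full graph."""
    return model(subgraph_edges) == model(graph_edges)


# Original graph: a triangle (0, 1, 2) with a two-edge tail attached.
graph = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]

# Two candidate explanation subgraphs.
candidate_a = [(0, 1), (1, 2), (0, 2)]  # keeps the triangle
candidate_b = [(2, 3), (3, 4)]          # keeps only the tail

print(label_preserved(toy_model, graph, candidate_a))
print(label_preserved(toy_model, graph, candidate_b))
```

In this toy setting the rule-based model behaves identically on small and large graphs, so the comparison is trustworthy by construction. A GNN trained only on full graphs offers no such guarantee when it is queried on a small subgraph, which is the distribution shift that proxy graphs are meant to address.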