GCFX: Generative Counterfactual Explanations for Deep Graph Models at the Model Level

Deep graph learning models have demonstrated remarkable capabilities in processing graph-structured data and are widely applied across many fields. However, their complex internal architectures and lack of transparency make their decisions difficult to explain, which leaves users struggling to understand and trust them. In this paper, we explore model-level explanation techniques for deep graph learning models, aiming to give users a comprehensive understanding of a model's overall decision-making process and underlying mechanisms. Specifically, we address counterfactual explanation for deep graph learning models by introducing GCFX, a generative model-level counterfactual explanation approach based on deep graph generation. GCFX combines an enhanced deep graph generation framework with a global summarization algorithm to produce a set of high-quality counterfactual explanations that reflect the model's global predictive behavior. Its architecture combines dual encoders, structure-aware taggers, and Message Passing Neural Network decoders, enabling it to accurately learn the true latent distribution of the input data and to generate high-quality, closely related counterfactual examples. A global counterfactual summarization algorithm then selects the most representative and comprehensive explanations from the many candidate counterfactuals, providing broad insight into the model's global predictive patterns. Experiments on a synthetic dataset and several real-world datasets demonstrate that GCFX outperforms existing methods in counterfactual validity and coverage while maintaining low explanation cost, making global counterfactual explanations more practical and trustworthy.
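The abstract does not spell out how the global summarization step works, so the following is only a minimal sketch of one plausible reading: a greedy maximum-coverage selection over candidate counterfactuals. Everything named here is an assumption made for illustration and is not taken from GCFX itself: the function `greedy_summary`, the distance threshold `theta`, the summary budget `k`, and the toy `edge_edit_distance` (graphs are represented as sets of edges over a shared node set). Candidates are assumed to be already filtered for validity, meaning each one flips the model's prediction.

```python
from typing import Callable, FrozenSet, List, Sequence, Set, Tuple

Edge = Tuple[int, int]
Graph = FrozenSet[Edge]  # toy representation (assumption): a graph is its edge set


def edge_edit_distance(a: Graph, b: Graph) -> int:
    """Number of edge insertions/deletions needed to turn a into b."""
    return len(a ^ b)


def greedy_summary(
    inputs: Sequence[Graph],
    candidates: Sequence[Graph],
    distance: Callable[[Graph, Graph], float],
    theta: float,
    k: int,
) -> Tuple[List[int], float]:
    """Greedily pick up to k candidates that cover the most input graphs.

    An input is "covered" when some chosen candidate lies within distance
    theta of it. Returns the chosen candidate indices and the coverage ratio.
    """
    # covers[j] = indices of the input graphs that candidate j covers
    covers: List[Set[int]] = [
        {i for i, g in enumerate(inputs) if distance(g, c) <= theta}
        for c in candidates
    ]
    chosen: List[int] = []
    covered: Set[int] = set()
    for _ in range(min(k, len(candidates))):
        # Pick the unchosen candidate with the largest marginal coverage gain.
        best = max(
            (j for j in range(len(candidates)) if j not in chosen),
            key=lambda j: len(covers[j] - covered),
            default=None,
        )
        if best is None or not (covers[best] - covered):
            break  # no remaining candidate covers anything new
        chosen.append(best)
        covered |= covers[best]
    coverage = len(covered) / len(inputs) if inputs else 0.0
    return chosen, coverage


# Toy data (hypothetical): three input graphs and three candidate counterfactuals.
inputs = [
    frozenset({(0, 1), (1, 2)}),
    frozenset({(0, 1), (1, 2), (2, 3)}),
    frozenset({(4, 5), (5, 6), (6, 7)}),
]
candidates = [
    frozenset({(0, 1)}),
    frozenset({(4, 5), (5, 6)}),
    frozenset({(0, 1), (1, 2), (2, 3), (3, 4)}),
]

chosen, coverage = greedy_summary(inputs, candidates, edge_edit_distance, theta=2, k=2)
print(chosen, coverage)
```

In this sketch, coverage is the fraction of input graphs that have at least one selected counterfactual within `theta`, and the explanation cost of a covered input could be taken as its distance to the nearest selected counterfactual. The greedy rule is a standard heuristic for maximum coverage, chosen here for brevity; the actual GCFX algorithm may differ.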
