Following a fast initial breakthrough in graph-based learning, Graph Neural Networks (GNNs) have reached widespread application in many science and engineering fields, prompting the need for methods to understand their decision process. GNN explainers have started to emerge in recent years, with a multitude of methods both novel and adapted from other domains. To sort out this plethora of alternative approaches, several studies have benchmarked the performance of different explainers in terms of various explainability metrics. However, these earlier works make no attempt to provide insight into why different GNN architectures are more or less explainable, or into which explainer should be preferred in a given setting. In this survey, we fill these gaps by devising a systematic experimental study, which tests ten explainers on eight representative architectures trained on six carefully designed graph and node classification datasets. With our results, we provide key insights into the choice and applicability of GNN explainers, isolate the key components that make them usable and successful, and provide recommendations on how to avoid common interpretation pitfalls. We conclude by highlighting open questions and directions for future research.