Following rapid initial breakthroughs in graph-based learning, Graph Neural Networks (GNNs) have found widespread application in many science and engineering fields, prompting the need for methods to understand their decision process. GNN explainers have started to emerge in recent years, with a multitude of methods both novel and adapted from other domains. To sort out this plethora of alternative approaches, several studies have benchmarked the performance of different explainers in terms of various explainability metrics. However, these earlier works make no attempt to provide insights into why different GNN architectures are more or less explainable, or which explainer should be preferred in a given setting. In this survey, we fill these gaps by devising a systematic experimental study, which tests ten explainers on eight representative architectures trained on six carefully designed graph and node classification datasets. With our results, we provide key insights on the choice and applicability of GNN explainers, isolate the key components that make them usable and successful, and provide recommendations on how to avoid common interpretation pitfalls. We conclude by highlighting open questions and directions for possible future research.