Graph Neural Networks (GNNs) have been widely adopted across many domains, yet they are notably vulnerable to adversarial and backdoor attacks. In particular, backdoor attacks based on subgraph insertion have been shown to be both effective and stealthy in graph classification tasks, circumventing various existing defense methods. In this paper, we propose E-SAGE, a novel explainability-based approach to defending against GNN backdoor attacks. We find that malicious edges and benign edges differ significantly in the importance scores assigned by explainability evaluation. Accordingly, E-SAGE adaptively applies an iterative edge pruning process to the graph based on these edge scores. Through extensive experiments, we demonstrate the effectiveness of E-SAGE against state-of-the-art graph backdoor attacks in different attack settings. In addition, we investigate the effectiveness of E-SAGE against adversarial attacks.
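To make the idea of score-driven iterative pruning concrete, the following is a minimal sketch and not the E-SAGE algorithm itself, whose details the abstract does not give. It assumes, purely for illustration, that suspicious edges are those whose importance score is an unusually high outlier, and it uses a z-score cutoff as the stopping rule. The names `prune_edges`, `score_fn`, `z_threshold`, and `max_rounds` are hypothetical, and `score_fn` stands in for whatever explainability method produces per-edge importance scores.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List, Tuple

Edge = Tuple[int, int]


def prune_edges(
    edges: List[Edge],
    score_fn: Callable[[List[Edge]], Dict[Edge, float]],
    z_threshold: float = 2.0,  # hypothetical cutoff, not taken from the paper
    max_rounds: int = 10,      # hypothetical cap on pruning rounds
) -> List[Edge]:
    """Repeatedly re-score the remaining edges and drop high-score outliers."""
    kept = list(edges)
    for _ in range(max_rounds):
        if len(kept) < 3:
            break  # too few edges for a meaningful outlier test
        scores = score_fn(kept)  # assumed: explainer re-run on the pruned graph
        values = list(scores.values())
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            break  # all scores identical, nothing stands out
        flagged = {e for e, s in scores.items() if (s - mu) / sigma > z_threshold}
        if not flagged:
            break  # no remaining outliers, stop pruning
        kept = [e for e in kept if e not in flagged]
    return kept
```

Re-scoring after every round is what makes the loop adaptive in this sketch: removing one set of edges can change the importance assigned to the rest, so the cutoff is re-evaluated on the pruned graph rather than applied once.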