BadImplant: Injection-based Multi-Targeted Graph Backdoor Attack

Graph neural networks (GNNs) have demonstrated exceptional performance on critical problems across diverse domains, yet they remain susceptible to backdoor attacks. Existing studies of backdoor attacks on graph classification are limited to single-target attacks that use a subgraph-replacement mechanism, in which the attacker implants only one trigger into the GNN model. In this paper, we introduce the first multi-targeted backdoor attack for the graph classification task, in which multiple triggers simultaneously redirect predictions to different target labels. Instead of subgraph replacement, we propose subgraph injection, which preserves the structure of the original graphs while poisoning the clean graphs. Extensive experiments demonstrate the efficacy of our approach: the attack achieves high attack success rates for all target labels with minimal impact on clean accuracy. Experimental results on five datasets demonstrate the superior performance of our attack framework compared to the conventional subgraph replacement-based attack. Our analysis on four GNN models confirms the generalization capability of our attack, which is effective regardless of GNN model architecture and training parameter settings. We further investigate the impact of the attack design parameters, including injection methods, number of connections, trigger sizes, trigger edge density, and poisoning ratios. Additionally, our evaluation against state-of-the-art defenses (randomized smoothing and fine-pruning) demonstrates the robustness of our proposed multi-targeted attacks. This work highlights the vulnerability of GNNs to multi-targeted backdoor attacks in the graph classification task. Our source code will be available at https://github.com/SiSL-URI/Multi-Targeted-Graph-Backdoor-Attack.
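To make the distinction between subgraph injection and subgraph replacement concrete, the following is a minimal sketch, not the authors' implementation. It assumes graphs are stored as a node count plus a set of undirected edges, that each trigger is a random subgraph defined by a size and an edge density, and that poisoned graphs are split evenly across target labels. The names `make_trigger`, `inject_trigger`, and `poison_dataset` are hypothetical and do not come from the released code.

```python
import random
from itertools import combinations


def make_trigger(size, density, rng):
    """Assumed trigger form: a random subgraph on nodes 0..size-1.

    Each possible edge is kept with probability `density`.
    """
    return {(u, v) for u, v in combinations(range(size), 2)
            if rng.random() < density}


def inject_trigger(num_nodes, edges, trigger_size, trigger_edges,
                   num_connections, rng):
    """Append the trigger as new nodes and link it to the host graph.

    No original node or edge is removed or rewired, which is the
    difference from subgraph replacement.
    """
    offset = num_nodes  # trigger nodes get fresh indices after the host nodes
    new_edges = set(edges)
    new_edges |= {(u + offset, v + offset) for u, v in trigger_edges}
    # Candidate links between a host node and a trigger node.
    candidates = [(h, offset + t)
                  for h in range(num_nodes) for t in range(trigger_size)]
    k = min(num_connections, len(candidates))
    new_edges |= set(rng.sample(candidates, k))
    return num_nodes + trigger_size, new_edges


def poison_dataset(dataset, triggers, poison_ratio, num_connections, rng):
    """Poison a fraction of the graphs, one distinct trigger per target label.

    dataset:  list of (num_nodes, edges, label)
    triggers: {target_label: (trigger_size, trigger_edges)}
    Assumption: poisoned graphs are assigned to targets round-robin.
    """
    poisoned = list(dataset)
    count = int(poison_ratio * len(dataset))
    chosen = rng.sample(range(len(dataset)), count)
    targets = sorted(triggers)
    for i, idx in enumerate(chosen):
        target = targets[i % len(targets)]
        num_nodes, edges, _ = dataset[idx]
        size, trigger_edges = triggers[target]
        new_n, new_e = inject_trigger(num_nodes, edges, size, trigger_edges,
                                      num_connections, rng)
        poisoned[idx] = (new_n, new_e, target)  # relabel to the target class
    return poisoned
```

The arguments `num_connections`, `size`, `density`, and `poison_ratio` correspond loosely to the design parameters studied in the paper (number of connections, trigger size, trigger edge density, poisoning ratio). How the host nodes are chosen and how poisoned samples are allocated across targets are assumptions of this sketch.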
