Graph neural networks (GNNs) have demonstrated exceptional performance on critical problems across diverse domains, yet they remain susceptible to backdoor attacks. Existing studies of backdoor attacks on graph classification are limited to single-target attacks using subgraph replacement-based mechanisms, where the attacker implants only one trigger into the GNN model. In this paper, we introduce the first multi-targeted backdoor attack for the graph classification task, in which multiple triggers simultaneously redirect predictions to different target labels. Instead of subgraph replacement, we propose subgraph injection, which preserves the structure of the original graphs while poisoning the clean graphs. Extensive experiments demonstrate the efficacy of our approach: our attack achieves high attack success rates for all target labels with minimal impact on clean accuracy. Experimental results on five datasets demonstrate the superior performance of our attack framework compared to the conventional subgraph replacement-based attack. Our analysis of four GNN models confirms the generalization capability of our attack, which is effective regardless of GNN model architecture and training parameter settings. We further investigate the impact of the attack design parameters, including injection methods, number of connections, trigger sizes, trigger edge density, and poisoning ratios. Additionally, our evaluation against state-of-the-art defenses (randomized smoothing and fine-pruning) demonstrates the robustness of our proposed multi-target attacks. This work highlights the vulnerability of GNNs to multi-targeted backdoor attacks in the graph classification task. Our source code will be available at https://github.com/SiSL-URI/Multi-Targeted-Graph-Backdoor-Attack.
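The core distinction above is that subgraph injection appends the trigger to the clean graph and bridges it with a few edges, whereas subgraph replacement deletes part of the original structure. A minimal sketch of injection on an edge-list representation follows; `inject_trigger` and its parameters (e.g. `num_connections`) are our own illustrative naming, not the paper's actual API.

```python
import random

def inject_trigger(graph_edges, num_nodes, trigger_edges, trigger_size,
                   num_connections=1, seed=0):
    """Subgraph injection sketch: relabel the trigger's nodes so they follow
    the clean graph's nodes, append the trigger's edges (the original edges
    are left untouched), then add a few bridge edges between the two parts."""
    rng = random.Random(seed)
    offset = num_nodes  # trigger node i becomes node offset + i
    poisoned = list(graph_edges) + [(u + offset, v + offset)
                                    for u, v in trigger_edges]
    # Bridge the trigger to the clean graph with `num_connections` edges.
    for _ in range(num_connections):
        poisoned.append((rng.randrange(num_nodes),
                         offset + rng.randrange(trigger_size)))
    return poisoned

# Toy example: poison a 4-node path graph with a 3-node triangle trigger.
clean = [(0, 1), (1, 2), (2, 3)]
triangle = [(0, 1), (1, 2), (0, 2)]
poisoned = inject_trigger(clean, num_nodes=4,
                          trigger_edges=triangle, trigger_size=3)
# Every original edge survives, unlike replacement-based poisoning.
assert all(e in poisoned for e in clean)
```

The poisoned graph here has all 3 original edges, the 3 trigger edges, and 1 bridge edge; in a multi-target setting, a distinct trigger subgraph would be paired with each target label.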