Graph Neural Networks (GNNs) have shown remarkable performance in various tasks. However, recent work reveals that GNNs are vulnerable to backdoor attacks. Generally, a backdoor attack poisons the graph by attaching backdoor triggers and the target class label to a set of nodes in the training graph. A GNN trained on the poisoned graph is then misled into predicting the target class for test nodes that have the trigger attached. Despite the effectiveness of existing attacks, our empirical analysis shows that the triggers they generate tend to be out-of-distribution (OOD), differing significantly from the clean data. Hence, these injected triggers can be easily detected and pruned by outlier detection methods widely used in real-world applications. Therefore, in this paper, we study a novel problem of unnoticeable graph backdoor attacks with in-distribution (ID) triggers. To generate ID triggers, we introduce an OOD detector in conjunction with an adversarial learning strategy to generate trigger attributes that lie within the distribution of the clean data. To ensure a high attack success rate with ID triggers, we introduce novel modules designed to enhance trigger memorization by the victim model trained on the poisoned graph. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed method in generating in-distribution triggers that can bypass various defense strategies while maintaining a high attack success rate.
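The poisoning step and the outlier check described in the abstract can be illustrated with a minimal sketch. This is not the proposed method: it assumes a dense adjacency matrix, a single-node trigger with a fixed feature vector, and a simple distance-to-mean outlier score. The names `poison_graph` and `outlier_scores` are hypothetical and chosen only for illustration.

```python
import numpy as np

def poison_graph(adj, feats, labels, victim_ids, trigger_feat, target_class):
    """Attach one trigger node to each victim node and relabel the victim.

    Assumption: the trigger is a single node with a fixed feature vector;
    real attacks may use learned, multi-node trigger subgraphs.
    """
    n = adj.shape[0]
    k = len(victim_ids)
    # Enlarge the adjacency matrix to make room for k trigger nodes.
    new_adj = np.zeros((n + k, n + k), dtype=adj.dtype)
    new_adj[:n, :n] = adj
    # Append one copy of the trigger features per victim node.
    new_feats = np.vstack([feats, np.tile(trigger_feat, (k, 1))])
    # Labels cover the original nodes only; victims get the target class.
    new_labels = labels.copy()
    for i, v in enumerate(victim_ids):
        t = n + i  # index of the trigger node attached to victim v
        new_adj[v, t] = 1
        new_adj[t, v] = 1
        new_labels[v] = target_class
    return new_adj, new_feats, new_labels

def outlier_scores(feats, clean_feats):
    """Score each row of feats by Euclidean distance to the clean feature mean.

    Assumption: a stand-in for the outlier detectors mentioned in the text.
    """
    center = clean_feats.mean(axis=0)
    return np.linalg.norm(feats - center, axis=1)
```

Under these assumptions, a trigger whose features lie far from the clean data receives a much larger score than any clean node, which is why such a trigger is easy to prune; an in-distribution trigger would be one whose score is comparable to those of clean nodes.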