Graph Neural Networks (GNNs) have shown remarkable performance in various tasks. However, recent works reveal that GNNs are vulnerable to backdoor attacks. Generally, a backdoor attack poisons the graph by attaching backdoor triggers and the target class label to a set of nodes in the training graph. A GNN trained on the poisoned graph will then be misled into predicting the target class for test nodes that have the trigger attached. Although existing attacks are effective, our empirical analysis shows that the triggers they generate tend to be out-of-distribution (OOD), differing significantly from the clean data. Hence, these injected triggers can be easily detected and pruned with outlier detection methods that are widely used in real-world applications. Therefore, in this paper, we study a novel problem of unnoticeable graph backdoor attacks with in-distribution (ID) triggers. To generate ID triggers, we introduce an OOD detector in conjunction with an adversarial learning strategy so that the generated trigger attributes stay within the distribution of the clean data. To ensure a high attack success rate with ID triggers, we introduce novel modules designed to enhance trigger memorization by the victim model trained on the poisoned graph. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed method in generating in-distribution triggers that can bypass various defense strategies while maintaining a high attack success rate.
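To make the poisoning step and the OOD problem concrete, the following is a minimal sketch in Python with NumPy, not the method proposed here. It assumes a dense adjacency matrix, a single-node trigger per poisoned node, and a simple standardized-distance score standing in for the outlier detectors mentioned above. The names `attach_trigger` and `outlier_scores`, the label `-1` for unlabeled trigger nodes, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def attach_trigger(adj, feats, labels, poison_nodes, trigger_feat, target_class):
    """Attach one single-node trigger to each poisoned node and relabel that node.

    Assumption: the trigger is a single extra node with a fixed feature vector.
    """
    n = adj.shape[0]
    k = len(poison_nodes)
    new_adj = np.zeros((n + k, n + k), dtype=adj.dtype)
    new_adj[:n, :n] = adj
    # Trigger nodes are appended after the original nodes.
    new_feats = np.vstack([feats, np.tile(trigger_feat, (k, 1))])
    # Trigger nodes carry no training label; -1 marks them as unlabeled.
    new_labels = np.concatenate([labels, np.full(k, -1)])
    for i, v in enumerate(poison_nodes):
        t = n + i
        new_adj[v, t] = new_adj[t, v] = 1  # undirected edge node <-> trigger
        new_labels[v] = target_class       # poisoned node gets the target label
    return new_adj, new_feats, new_labels


def outlier_scores(feats, reference):
    """Distance from the reference feature mean, in units of per-feature std.

    A stand-in for an outlier detector, used only to illustrate OOD triggers.
    """
    mu = reference.mean(axis=0)
    sigma = reference.std(axis=0) + 1e-8
    return np.linalg.norm((feats - mu) / sigma, axis=1)


# Toy example: a ring graph with Gaussian node features.
rng = np.random.default_rng(0)
n, d = 20, 8
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1
feats = rng.normal(0.0, 1.0, size=(n, d))
labels = rng.integers(0, 3, size=n)

# A trigger whose attributes lie far from the clean feature distribution.
ood_trigger = np.full(d, 10.0)
p_adj, p_feats, p_labels = attach_trigger(
    adj, feats, labels, poison_nodes=[0, 5], trigger_feat=ood_trigger, target_class=2
)

scores = outlier_scores(p_feats, reference=feats)
# Compare the trigger nodes' scores with the largest score among clean nodes.
print(scores[n:], scores[:n].max())
```

In this toy setting, the trigger nodes' scores can be compared directly with those of the clean nodes, and a defender could prune any node whose score exceeds a threshold. An ID trigger in the sense described above would need attributes whose scores are comparable to those of clean nodes.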