Graph Neural Networks (GNNs) have shown remarkable performance in various tasks. However, recent work reveals that GNNs are vulnerable to backdoor attacks. Generally, a backdoor attack poisons the graph by attaching backdoor triggers and the target class label to a set of nodes in the training graph. A GNN trained on the poisoned graph is then misled into predicting the target class for test nodes that have the trigger attached. Despite the effectiveness of existing methods, our empirical analysis shows that the triggers they generate tend to be out-of-distribution (OOD), differing significantly from the clean data. Hence, these injected triggers can be easily detected and pruned with outlier detection methods widely used in real-world applications. Therefore, in this paper, we study a novel problem of unnoticeable graph backdoor attacks with in-distribution (ID) triggers. To generate ID triggers, we introduce an OOD detector in conjunction with an adversarial learning strategy so that the generated trigger attributes fall within the distribution of the clean data. To ensure a high attack success rate with ID triggers, we introduce novel modules designed to enhance trigger memorization by the victim model trained on the poisoned graph. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed method in generating in-distribution triggers that can bypass various defense strategies while maintaining a high attack success rate.
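The poisoning step and the outlier-based pruning described above can be illustrated with a small sketch. It is not the method proposed in the paper: the function names (`attach_triggers`, `outlier_scores`, `flag_outliers`), the single-node trigger, the mean absolute z-score detector, and the threshold of 3.0 are all assumptions chosen for illustration, standing in for the trigger generators and outlier detectors the abstract refers to.

```python
import math
import random


def attach_triggers(features, labels, edges, victims, trigger, target_class):
    """Poison a graph: link one trigger node to each victim and relabel the victim.

    Hypothetical helper; a real attack would generate the trigger with a model.
    """
    features = [row[:] for row in features]
    labels = labels[:]
    edges = edges[:]
    trigger_ids = []
    for v in victims:
        t = len(features)              # index of the new trigger node
        features.append(trigger[:])    # trigger node attributes
        labels.append(None)            # trigger nodes carry no training label
        edges.append((v, t))           # attach the trigger to the victim
        labels[v] = target_class       # flip the victim's label
        trigger_ids.append(t)
    return features, labels, edges, trigger_ids


def outlier_scores(features):
    """Mean absolute z-score of each node's attributes (assumed stand-in detector)."""
    n, dim = len(features), len(features[0])
    means = [sum(row[d] for row in features) / n for d in range(dim)]
    stds = []
    for d in range(dim):
        var = sum((row[d] - means[d]) ** 2 for row in features) / n
        stds.append(math.sqrt(var) or 1.0)  # avoid dividing by zero
    return [
        sum(abs(row[d] - means[d]) / stds[d] for d in range(dim)) / dim
        for row in features
    ]


def flag_outliers(features, threshold=3.0):
    """Return indices of nodes whose score exceeds the (assumed) threshold."""
    return [i for i, s in enumerate(outlier_scores(features)) if s > threshold]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
    labels = [i % 3 for i in range(200)]
    edges = [(i, (i + 1) % 200) for i in range(200)]
    victims = [0, 1, 2, 3, 4]

    # Trigger far from the clean attributes (OOD) versus one near them (ID).
    for name, trigger in [("OOD", [6.0] * 8), ("ID", [0.0] * 8)]:
        feats, _, _, ids = attach_triggers(clean, labels, edges, victims, trigger, 2)
        flagged = set(flag_outliers(feats))
        print(name, "triggers flagged:", len(flagged & set(ids)), "of", len(ids))
```

In this toy setting, a trigger whose attributes sit far from the clean data is caught by the detector, while a trigger drawn from the clean attribute range is not, which is the gap an in-distribution trigger aims to exploit.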