Graph Neural Networks (GNNs) have achieved remarkable results across a variety of tasks. Recent studies reveal that graph backdoor attacks can poison a GNN model so that it predicts test nodes with an attached trigger as the target class. However, in addition to injecting triggers into training nodes, these attacks generally require changing the labels of the trigger-attached training nodes to the target class, which is impractical in real-world scenarios. In this work, we focus on the clean-label graph backdoor attack, a realistic but understudied setting in which training labels cannot be modified. Our preliminary analysis shows that existing graph backdoor attacks generally fail under the clean-label setting. Further analysis identifies the core reason: existing methods are unable to poison the prediction logic of GNN models, so the model deems the triggers unimportant for prediction. We therefore study the novel problem of effective clean-label graph backdoor attacks that poison the inner prediction logic of GNN models. To solve it, we propose BA-Logic, which coordinates a poisoned node selector with a logic-poisoning trigger generator. Extensive experiments on real-world datasets demonstrate that our method effectively improves the attack success rate and outperforms state-of-the-art graph backdoor attacks under clean-label settings. Our code is available at https://anonymous.4open.science/r/BA-Logic
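To make the distinction between the two threat models concrete, the following is a minimal sketch of how a poisoned training set might be built in each setting. It is not the BA-Logic method: the random node selection and the fixed, fully connected trigger are placeholder assumptions standing in for the learned poisoned node selector and logic-poisoning trigger generator, and the names `attach_trigger` and `poison` are hypothetical. The only point illustrated is that a clean-label attacker may attach triggers solely to training nodes that already carry the target label, and never rewrites a label.

```python
import random


def attach_trigger(adj, node, trigger_size):
    """Attach a fully connected trigger of `trigger_size` new nodes to `node`.

    Assumes node ids are the contiguous integers 0..len(adj)-1.
    The fixed clique is a placeholder, not a learned trigger.
    """
    start = len(adj)
    new_nodes = list(range(start, start + trigger_size))
    for t in new_nodes:
        adj[t] = set()
    # Connect the trigger nodes to each other.
    for a in new_nodes:
        for b in new_nodes:
            if a != b:
                adj[a].add(b)
    # Link the first trigger node to the poisoned training node.
    adj[node].add(new_nodes[0])
    adj[new_nodes[0]].add(node)
    return new_nodes


def poison(adj, labels, train_nodes, target_class, budget, clean_label, seed=0):
    """Return a poisoned copy of the graph, the labels, and the chosen nodes."""
    rng = random.Random(seed)
    adj = {u: set(vs) for u, vs in adj.items()}  # copy; inputs stay untouched
    labels = dict(labels)
    if clean_label:
        # Clean-label: only nodes that already belong to the target class.
        candidates = [v for v in train_nodes if labels[v] == target_class]
    else:
        # Label-flipping baseline: any training node is eligible.
        candidates = list(train_nodes)
    # Random choice is a stand-in for a learned node selector.
    chosen = rng.sample(candidates, min(budget, len(candidates)))
    for v in chosen:
        attach_trigger(adj, v, trigger_size=3)
        if not clean_label:
            labels[v] = target_class  # the step a clean-label attacker cannot take
    return adj, labels, chosen


# Toy path graph 0-1-2-3 with two classes; class 1 is the target.
toy_adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
toy_labels = {0: 0, 1: 1, 2: 1, 3: 0}
toy_train = [0, 1, 2, 3]

adj_c, labels_c, chosen_c = poison(toy_adj, toy_labels, toy_train, 1, 2, clean_label=True)
adj_d, labels_d, chosen_d = poison(toy_adj, toy_labels, toy_train, 1, 4, clean_label=False)
```

In the clean-label call the returned labels are identical to the originals, so the trigger never co-occurs with a changed label. This is consistent with the observation above that the model may come to treat such triggers as unimportant for prediction.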