Graph Prompt Learning (GPL) has emerged as a promising approach that uses prompts to adapt pre-trained GNN models to specific downstream tasks without fine-tuning the entire model. Despite its advantages, little attention has been paid to GPL's vulnerability to backdoor attacks, in which an adversary manipulates the model's behavior by embedding hidden triggers. Existing graph backdoor attacks rely on modifying model parameters during training, but this approach is impractical in GPL because the GNN encoder's parameters are frozen after pre-training. Moreover, downstream users may fine-tune their own task models on clean datasets, further complicating the attack. In this paper, we propose TGPA, a backdoor attack framework designed specifically for GPL. TGPA injects backdoors into graph prompts without modifying the pre-trained GNN encoder, achieving both a high attack success rate and high clean accuracy. To address the challenge of downstream model fine-tuning, we introduce a fine-tuning-resistant poisoning approach that keeps the backdoor effective even after downstream model adjustments. Extensive experiments on multiple datasets under various settings demonstrate the effectiveness of TGPA in compromising GPL models with fixed GNN encoders.
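To make the threat model concrete, below is a minimal PyTorch sketch of prompt-space backdoor injection against a frozen encoder. Everything here (the toy dense-adjacency GCN, the additive feature prompt, the fully connected trigger node, the names `ToyGCNEncoder`, `GraphPrompt`, `inject_trigger`, and the loss weight) is a simplifying assumption for exposition, not TGPA's actual design: the point is only that, with the encoder fixed, the attacker can optimize the prompt and task head jointly for a clean objective and a trigger-to-target-class objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyGCNEncoder(nn.Module):
    """Stand-in for a pre-trained GNN encoder; its weights stay frozen."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, adj):
        h = F.relu(adj @ self.w1(x))   # dense-adjacency message passing
        h = adj @ self.w2(h)
        return h.mean(dim=0)           # graph-level embedding via mean pooling

class GraphPrompt(nn.Module):
    """Learnable additive feature prompt: the only weights the attacker trains."""
    def __init__(self, in_dim):
        super().__init__()
        self.p = nn.Parameter(torch.zeros(in_dim))

    def forward(self, x):
        return x + self.p              # prompt added to every node feature

def inject_trigger(x, adj, trig_feat):
    """Attach a single trigger node (fixed features, fully connected)."""
    n = x.size(0)
    x_t = torch.cat([x, trig_feat.unsqueeze(0)], dim=0)
    adj_t = torch.zeros(n + 1, n + 1)
    adj_t[:n, :n] = adj
    adj_t[n, :] = 1.0
    adj_t[:, n] = 1.0
    return x_t, adj_t

in_dim, hid_dim, n_cls, target_cls = 8, 16, 3, 0
encoder = ToyGCNEncoder(in_dim, hid_dim)
for prm in encoder.parameters():
    prm.requires_grad_(False)          # encoder stays fixed, as in GPL

prompt = GraphPrompt(in_dim)
head = nn.Linear(hid_dim, n_cls)
trig_feat = torch.ones(in_dim)         # hypothetical trigger feature vector
opt = torch.optim.Adam(list(prompt.parameters()) + list(head.parameters()), lr=1e-2)

x, adj, y = torch.randn(5, in_dim), torch.ones(5, 5), torch.tensor(1)  # toy graph

for _ in range(100):
    # clean objective: preserve predictions on untriggered graphs
    logits_clean = head(encoder(prompt(x), adj)).unsqueeze(0)
    loss_clean = F.cross_entropy(logits_clean, y.unsqueeze(0))
    # backdoor objective: triggered graphs map to the attacker's target class
    x_t, adj_t = inject_trigger(x, adj, trig_feat)
    logits_bd = head(encoder(prompt(x_t), adj_t)).unsqueeze(0)
    loss_bd = F.cross_entropy(logits_bd, torch.tensor([target_cls]))
    loss = loss_clean + 1.0 * loss_bd  # lambda_bd = 1.0, a hypothetical weight
    opt.zero_grad()
    loss.backward()
    opt.step()
```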
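The fine-tuning-resistance idea can be sketched in the same spirit. Continuing the code above, a first-order approximation is to simulate a few clean fine-tuning steps that a downstream user might apply to the task head, then require the trigger to still hit the target class under the fine-tuned head. The helper `simulate_user_finetune`, the simulation depth `k_steps`, the learning rates, and the first-order shortcut (no differentiation through the fine-tuning trajectory) are all assumptions; the paper's actual fine-tuning-resistant objective may differ.

```python
import copy

def simulate_user_finetune(head, encoder, prompt, clean_batches, k_steps=3, lr=1e-2):
    """Return a copy of the task head after k simulated clean fine-tuning steps."""
    ft_head = copy.deepcopy(head)
    ft_opt = torch.optim.SGD(ft_head.parameters(), lr=lr)
    for x_c, adj_c, y_c in clean_batches[:k_steps]:
        with torch.no_grad():                  # the user keeps prompt/encoder fixed
            z = encoder(prompt(x_c), adj_c)
        ft_loss = F.cross_entropy(ft_head(z).unsqueeze(0), y_c.unsqueeze(0))
        ft_opt.zero_grad()
        ft_loss.backward()
        ft_opt.step()
    return ft_head

atk_opt = torch.optim.Adam(prompt.parameters(), lr=1e-2)  # attacker tunes only the prompt
tgt = torch.tensor([target_cls])

for _ in range(100):
    # simulate the downstream user's clean fine-tuning of the task head
    ft_head = simulate_user_finetune(head, encoder, prompt, [(x, adj, y)])
    # the trigger must fire for both the current head and the fine-tuned one;
    # gradients flow through both heads back to the prompt, but only the
    # prompt's parameters are updated
    x_t, adj_t = inject_trigger(x, adj, trig_feat)
    z_t = encoder(prompt(x_t), adj_t).unsqueeze(0)
    loss = F.cross_entropy(head(z_t), tgt) + F.cross_entropy(ft_head(z_t), tgt)
    atk_opt.zero_grad()
    loss.backward()
    atk_opt.step()
```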