Graph Neural Networks (GNNs), which are designed specifically to process graph data, have achieved remarkable success in a wide range of applications. Link stealing attacks on graph data pose a significant privacy threat: attackers aim to extract sensitive relationships between nodes (entities), which can lead to academic misconduct, fraudulent transactions, or other malicious activities. Previous studies have focused primarily on single datasets and have not explored cross-dataset attacks, let alone attacks that leverage the combined knowledge of multiple attackers. We find, however, that an attacker can combine the data knowledge of multiple attackers to build a more effective attack model, a setting we refer to as a cross-dataset attack. Moreover, if this knowledge is extracted with the help of Large Language Models (LLMs), the attack capability increases further. In this paper, we propose a novel link stealing attack method that exploits both cross-dataset knowledge and LLMs. The LLM is used to process datasets with different data structures in the cross-dataset setting. Each attacker fine-tunes the LLM on their own dataset to produce a tailored attack model. We then introduce a novel model merging method that effectively integrates the parameters of these attacker-specific models. The result is a merged attack model with superior generalization, enabling effective attacks not only on the attackers' own datasets but also on previously unseen (out-of-domain) datasets. We conducted extensive experiments on four datasets to demonstrate the effectiveness of our method. Additional experiments with three different GNN and LLM architectures further illustrate the generality of our approach.
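The abstract states that the attacker-specific models are combined by merging their parameters, but it does not describe the merging rule itself. As a purely illustrative baseline, and not the method proposed in the paper, the sketch below shows the simplest form of parameter merging: a weighted average over models that share the same architecture. The name `merge_parameters`, the dictionary-of-lists representation of a model, and the weighting scheme are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical representation: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def merge_parameters(
    models: Sequence[StateDict],
    weights: Optional[Sequence[float]] = None,
) -> StateDict:
    """Weighted average of parameters (an assumed baseline, not the paper's method)."""
    if not models:
        raise ValueError("at least one model is required")
    if weights is None:
        # Default to a uniform average across attackers.
        weights = [1.0] * len(models)
    if len(weights) != len(models):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    norm = [w / total for w in weights]

    # All models must expose the same parameters with the same sizes.
    names = models[0].keys()
    for model in models[1:]:
        if model.keys() != names:
            raise ValueError("models must share the same parameter names")

    merged: StateDict = {}
    for name in names:
        size = len(models[0][name])
        if any(len(model[name]) != size for model in models):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            sum(w * model[name][i] for w, model in zip(norm, models))
            for i in range(size)
        ]
    return merged


# Two toy "attacker" models fine-tuned from the same base architecture.
attacker_a = {"layer.weight": [1.0, 3.0]}
attacker_b = {"layer.weight": [3.0, 5.0]}
merged_model = merge_parameters([attacker_a, attacker_b])
```

Uniform averaging treats every attacker's model as equally informative. Passing explicit `weights` is one hypothetical way to favor a model trained on a larger or more relevant dataset.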