GRID: Protecting Training Graph from Link Stealing Attacks on GNN Models

Graph neural networks (GNNs) have exhibited superior performance in various classification tasks on graph-structured data. However, they are vulnerable to link stealing attacks, which infer whether a link exists between two nodes by measuring the similarity of the prediction vectors that a GNN model produces for those nodes. Such attacks pose severe security and privacy threats to the training graph used in GNN models. In this work, we propose a novel solution, called Graph Link Disguise (GRID), to defend against link stealing attacks with a formal guarantee that GNN model utility, measured as prediction accuracy, is retained. The key idea of GRID is to add carefully crafted noise to the nodes' prediction vectors so that adjacent nodes are disguised as n-hop indirect neighbors. We take the graph topology into account and add noise only to a subset of nodes (called core nodes) that covers all links, which keeps the noise on adjacent nodes from cancelling out and further reduces both the distortion loss and the computation cost. Our crafted noise ensures that 1) the noisy prediction vectors of any two adjacent nodes have a similarity level like that of two non-adjacent nodes, and 2) the model prediction is unchanged, so there is zero utility loss. Extensive experiments on five datasets show the effectiveness of GRID against different representative link stealing attacks under both transductive and inductive settings, as well as against two influence-based attacks. Meanwhile, it achieves a much better privacy-utility trade-off than existing methods when extended to GNNs.
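The three ingredients named in the abstract (a similarity-based attack signal, a node subset covering all links, and noise that leaves the predicted label unchanged) can be illustrated with a minimal sketch. It is not the GRID algorithm: the greedy cover, the linear blending toward a decoy vector, and all function names (`cosine`, `greedy_cover`, `disguise`) are assumptions chosen for illustration, and the toy graph and prediction vectors are made up.

```python
import math

def cosine(u, v):
    """Cosine similarity, one common signal a link stealing attack can use."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def argmax(p):
    return max(range(len(p)), key=lambda i: p[i])

def greedy_cover(edges):
    """Illustrative stand-in for core-node selection: repeatedly pick the
    node touching the most uncovered edges until every edge is covered."""
    remaining = set(edges)
    cover = set()
    while remaining:
        degree = {}
        for u, v in remaining:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        best = max(sorted(degree), key=lambda n: degree[n])
        cover.add(best)
        remaining = {e for e in remaining if best not in e}
    return cover

def disguise(p, decoy, steps=100):
    """Illustrative stand-in for crafted noise: blend p toward a decoy
    vector as far as possible while the top class stays strictly on top."""
    label = argmax(p)
    best = list(p)
    for k in range(1, steps + 1):
        t = k / steps
        mixed = [(1 - t) * a + t * b for a, b in zip(p, decoy)]
        if any(mixed[label] <= x for i, x in enumerate(mixed) if i != label):
            break  # the predicted label would change or tie; stop here
        best = mixed
    return best

# Toy path graph 0-1-2-3 with made-up prediction vectors.
edges = [(0, 1), (1, 2), (2, 3)]
preds = {
    0: [0.7, 0.2, 0.1],
    1: [0.8, 0.1, 0.1],
    2: [0.1, 0.8, 0.1],
    3: [0.2, 0.7, 0.1],
}

core = greedy_cover(edges)
# Disguise node 1 using the prediction of node 3, which is not adjacent to it.
noisy_1 = disguise(preds[1], preds[3])
print("core nodes:", sorted(core))
print("similarity(0, 1) before:", cosine(preds[0], preds[1]))
print("similarity(0, 1) after: ", cosine(preds[0], noisy_1))
```

In this toy setting, the similarity between the adjacent nodes 0 and 1 drops after the perturbation while the predicted class of node 1 stays the same, which mirrors the two properties the abstract claims for GRID's noise. How GRID actually selects core nodes and constructs its noise is not specified in the abstract.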
