Graph Neural Networks (GNNs) have advanced machine learning by exploiting graph-structured data, which is ubiquitous in the real world. GNNs have applications in fields ranging from social network analysis to drug discovery. Training a GNN is costly, requiring significant computational resources and human expertise, which makes a trained GNN valuable Intellectual Property (IP) for its owner. Recent studies have shown that GNNs are vulnerable to model-stealing attacks, which raises concerns over IP rights protection. Watermarking has been shown to be effective at protecting the IP of a GNN model. However, existing watermarking schemes for GNNs focus only on the node classification and graph classification tasks. To the best of our knowledge, we introduce the first watermarking scheme for GNNs tailored to the Link Prediction (LP) task. We call our proposed watermarking scheme GENIE (watermarking Graph nEural Networks for lInk prEdiction). We design GENIE using a novel backdoor attack to create a trigger set for two key methods of LP: (1) node representation-based and (2) subgraph-based. In GENIE, the watermark is embedded into the GNN model by training it on both the trigger set and a modified training set, resulting in a watermarked GNN model. To assess a suspect model, we verify the watermark against the trigger set. We extensively evaluate GENIE across 3 model architectures (i.e., SEAL, GCN, and GraphSAGE) and 7 real-world datasets. Furthermore, we validate the robustness of GENIE against 11 state-of-the-art watermark removal techniques and 3 model extraction attacks. We also demonstrate that GENIE is robust against ownership piracy attacks. Our ownership demonstration scheme statistically guarantees that both the False Positive Rate (FPR) and the False Negative Rate (FNR) are less than $10^{-6}$.
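As an illustration of verifying a watermark against a trigger set, the following minimal sketch treats verification as a one-sided threshold test. It is not GENIE's actual ownership demonstration procedure, which the abstract does not specify. It assumes, purely for illustration, that a non-watermarked model matches each trigger label independently with probability `p_chance`; the function names and this binomial null model are hypothetical. Under that assumption it bounds only the FPR, not the FNR.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_matches_for_fpr(n: int, p_chance: float, max_fpr: float) -> int:
    """Smallest match count k whose tail probability under the assumed
    null model is below max_fpr; returns n + 1 if no such k exists."""
    for k in range(n + 1):
        if binomial_tail(n, k, p_chance) < max_fpr:
            return k
    return n + 1


def verify_watermark(predictions, trigger_labels, threshold: int) -> bool:
    """Claim ownership if the suspect model's predictions agree with the
    trigger labels on at least `threshold` trigger links."""
    matches = sum(int(p == y) for p, y in zip(predictions, trigger_labels))
    return matches >= threshold


if __name__ == "__main__":
    # Hypothetical setting: 100 trigger links, chance agreement of 0.5.
    k = min_matches_for_fpr(n=100, p_chance=0.5, max_fpr=1e-6)
    print("required matches out of 100:", k)
```

Under this toy model, a larger trigger set lets the threshold sit at a smaller fraction of the set for the same FPR bound, which gives a watermarked model more room for a few mismatches on the trigger set.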