Graph Neural Networks (GNNs) have achieved considerable success in modeling complex graph data across a variety of applications. However, few studies have investigated privacy protection in GNNs. In this work, we propose a learning framework that provides node-level privacy for individual users while incurring low utility loss. We focus on a decentralized notion of differential privacy, namely Local Differential Privacy (LDP), and apply randomization mechanisms to perturb both feature and label data at the node level before the data is collected by a central server for model training. Specifically, we investigate the application of randomization mechanisms in high-dimensional feature settings and propose an LDP protocol with strict privacy guarantees. Building on frequency estimation from the statistical analysis of randomized data, we develop reconstruction methods that approximate features and labels from the perturbed data. We further formulate the learning framework so that frequency estimates over graph clusters supervise the training procedure at the sub-graph level. Extensive experiments on real-world and semi-synthetic datasets demonstrate the validity of the proposed model.
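To make the perturb-then-estimate idea concrete, the following is a minimal sketch of generalized randomized response, a standard LDP mechanism for categorical values, together with its unbiased frequency estimator. It is an illustration under assumptions, not the protocol proposed in this work: the abstract does not specify the mechanism, and the function names `grr_perturb` and `grr_estimate`, the class count, and the privacy budget `epsilon` are all hypothetical choices made for the example.

```python
import math
import random
from collections import Counter


def grr_perturb(label, num_classes, epsilon, rng):
    """Report the true label with probability p, else a different label chosen uniformly."""
    p = math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)
    if rng.random() < p:
        return label
    # Draw uniformly from the num_classes - 1 labels other than the true one.
    other = rng.randrange(num_classes - 1)
    return other if other < label else other + 1


def grr_estimate(reports, num_classes, epsilon):
    """Unbiased estimate of the true label frequencies from perturbed reports."""
    e = math.exp(epsilon)
    p = e / (e + num_classes - 1)    # probability of reporting the true label
    q = 1.0 / (e + num_classes - 1)  # probability of reporting any one other label
    n = len(reports)
    counts = Counter(reports)
    # Observed frequency is f*p + (1 - f)*q; solve for the true frequency f.
    return [(counts.get(c, 0) / n - q) / (p - q) for c in range(num_classes)]


# Hypothetical usage: 20,000 nodes, 3 classes with true frequencies 0.5 / 0.3 / 0.2.
rng = random.Random(0)
true_labels = [0] * 10000 + [1] * 6000 + [2] * 4000
reports = [grr_perturb(y, 3, 1.0, rng) for y in true_labels]
estimates = grr_estimate(reports, 3, 1.0)
```

Each node randomizes its own label locally, so the server only ever sees `reports`; the estimator recovers aggregate frequencies, not individual labels. Estimating over a group of nodes in this way is the kind of aggregate statistic that cluster-level supervision could draw on.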