In recent years, with the rapid development of graph neural networks (GNNs), more and more graph datasets have been published for GNN tasks. However, when an upstream data owner publishes graph data, privacy concerns often arise, because many real-world graphs contain sensitive information such as a person's friend list. Differential privacy (DP) is a common technique for protecting privacy, but due to the complex topological structure of graph data, applying DP to graphs often disrupts the message passing and aggregation of GNN models, leading to a drop in model accuracy. In this paper, we propose a novel graph edge protection framework, graph publisher (GraphPub), which protects the graph topology while keeping the utility of the data essentially unchanged. Through reverse learning and an encoder-decoder mechanism, we search for false edges that have little negative impact on the aggregation of node features, and use them to replace some real edges. The modified graph is then published, in which real and false edges are difficult to distinguish. Extensive experiments show that our framework achieves model accuracy close to that of the original graph under an extremely low privacy budget.
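To make the edge-replacement idea concrete, the following is a minimal illustrative sketch, not the actual GraphPub algorithm: instead of reverse learning and an encoder-decoder, it uses a simple heuristic (an assumption on our part) that false edges connecting nodes with similar features perturb GNN aggregation the least. The function name `publish_graph` and all parameters are hypothetical.

```python
import numpy as np

def publish_graph(adj, features, replace_ratio=0.3, rng=None):
    """Replace a fraction of real edges with 'plausible' false edges.

    Hypothetical sketch: candidate false edges are scored by cosine
    similarity of their endpoint features, assuming that connecting
    similar nodes has little impact on feature aggregation. The real
    framework instead selects false edges via reverse learning and an
    encoder-decoder mechanism.
    """
    rng = rng or np.random.default_rng(0)
    n = adj.shape[0]

    # Cosine similarity between node feature vectors.
    norm = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    sim = norm @ norm.T

    # Enumerate real edges and candidate false edges (upper triangle only).
    real = [(i, j) for i in range(n) for j in range(i + 1, n) if adj[i, j]]
    fake = [(i, j) for i in range(n) for j in range(i + 1, n) if not adj[i, j]]

    k = int(len(real) * replace_ratio)
    # Drop k randomly chosen real edges ...
    drop_idx = rng.choice(len(real), size=k, replace=False)
    # ... and add the k false edges whose endpoints are most similar.
    fake.sort(key=lambda e: sim[e], reverse=True)

    pub = adj.copy()
    for idx in drop_idx:
        i, j = real[idx]
        pub[i, j] = pub[j, i] = 0
    for i, j in fake[:k]:
        pub[i, j] = pub[j, i] = 1
    return pub
```

The published graph keeps the same number of edges as the original, so basic statistics are preserved while an observer cannot tell which edges are genuine.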