Graph Neural Networks (GNNs) have become a popular tool for learning on graphs, but their widespread use raises privacy concerns because graph data can contain personal or sensitive information. Differentially private GNN models have recently been proposed to preserve privacy while still allowing effective learning over graph-structured datasets. However, achieving a good balance between accuracy and privacy in GNNs remains challenging due to the intrinsic structural connectivity of graphs. In this paper, we propose a new differentially private GNN called ProGAP that uses a progressive training scheme to improve this accuracy-privacy trade-off. Combined with the aggregation perturbation technique to ensure differential privacy, ProGAP splits a GNN into a sequence of overlapping submodels that are trained progressively, expanding from the first submodel to the complete model. Specifically, each submodel is trained over the privately aggregated node embeddings learned and cached by the previous submodels, leading to increased expressive power compared to previous approaches while limiting the incurred privacy costs. We formally prove that ProGAP ensures edge-level and node-level privacy guarantees for both the training and inference stages, and evaluate its performance on benchmark graph datasets. Experimental results demonstrate that ProGAP can achieve up to 5-10% higher accuracy than existing state-of-the-art differentially private GNNs. Our code is available at https://github.com/sisaman/ProGAP.
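To make the aggregation perturbation step concrete, the following is a minimal NumPy sketch of a noisy sum aggregation. It is an illustration under stated assumptions and not the authors' implementation: the function name `perturbed_aggregate`, the dense adjacency matrix, and the fixed noise scale `sigma` are all hypothetical, and calibrating `sigma` to a privacy budget is left out.

```python
import numpy as np

def perturbed_aggregate(adj, x, sigma, rng):
    """Noisy sum aggregation over neighbors (illustrative sketch only)."""
    # Scale each node embedding to unit L2 norm so that adding or removing
    # a single edge changes the aggregate by a bounded amount.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x_unit = x / np.maximum(norms, 1e-12)
    # Sum aggregation: row i collects the embeddings of nodes j with adj[i, j] = 1.
    agg = adj @ x_unit
    # Add Gaussian noise with standard deviation sigma (hypothetical parameter;
    # in practice it would be derived from the target privacy budget).
    return agg + rng.normal(0.0, sigma, size=agg.shape)

rng = np.random.default_rng(0)
# Toy undirected graph: node 0 is connected to nodes 1 and 2.
adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]], dtype=float)
x = rng.normal(size=(3, 4))

noisy = perturbed_aggregate(adj, x, sigma=1.0, rng=rng)
clean = perturbed_aggregate(adj, x, sigma=0.0, rng=rng)  # noise-free, for comparison
```

In a progressive scheme like the one described above, an output such as `noisy` would be computed once per stage and cached, so that later submodels reuse it instead of querying the graph again.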