Differential privacy (DP) has been widely applied to learning on tabular, image, and sequential data, where the concern is instance-level privacy. In learning on graphs, by contrast, work on node-level privacy is scarce. The difficulty is that existing DP protocols do not readily apply to the message-passing mechanism in Graph Neural Networks (GNNs). In this study, we propose a protocol that specifically addresses node-level privacy. It consists of two main components: 1) a sampling routine called HeterPoisson, which employs a specialized node sampling strategy and a series of tailored operations to generate a batch of sub-graphs with desired properties, and 2) a randomization routine that uses symmetric multivariate Laplace (SML) noise instead of the commonly used Gaussian noise. Our privacy accounting shows that this particular combination provides a non-trivial privacy guarantee. In addition, our protocol enables GNN learning with good performance, as demonstrated by experiments on five real-world datasets; compared with existing baselines, our method shows significant advantages, especially in the high-privacy regime. Experimentally, we also 1) perform membership inference attacks against our protocol and 2) apply privacy audit techniques to confirm our protocol's privacy integrity. In the sequel, we study a seemingly appealing approach \cite{sajadmanesh2023gap} (USENIX'23) that protects node-level privacy via differentially private node/instance embeddings. Unfortunately, that work has fundamental privacy flaws, which we identify through a thorough case study. More importantly, we prove that achieving both (strong) privacy and (acceptable) utility through private instance embedding is impossible. The implication is that such an approach faces intrinsic utility barriers when enforcing differential privacy.
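To make the two ingredients concrete, the following is a minimal sketch, not the protocol itself. It assumes plain Poisson sampling (each node included independently with a fixed rate) as a stand-in for HeterPoisson, whose tailored sub-graph operations are not specified here, and it draws SML noise through the standard Gaussian scale-mixture representation, $\sqrt{W}\,Z$ with $W \sim \mathrm{Exp}(1)$ and $Z \sim \mathcal{N}(0, \sigma^2 I)$. The function names, the clipping step, and all parameters are illustrative assumptions; no privacy accounting is performed.

```python
import numpy as np

def poisson_sample(num_nodes, rate, rng):
    """Plain Poisson sampling (assumed stand-in, not HeterPoisson):
    include each node index independently with probability `rate`."""
    mask = rng.random(num_nodes) < rate
    return np.flatnonzero(mask)

def sml_noise(dim, scale, rng):
    """Symmetric multivariate Laplace noise with isotropic scale.

    Uses the Gaussian scale-mixture form sqrt(W) * Z,
    where W ~ Exp(1) and Z ~ N(0, scale^2 * I).
    """
    w = rng.exponential(1.0)              # one shared mixing variable
    z = rng.normal(0.0, scale, size=dim)  # isotropic Gaussian vector
    return np.sqrt(w) * z

def noisy_clipped_sum(per_node_grads, clip_norm, scale, rng):
    """Clip each row to L2 norm <= clip_norm, sum, then add SML noise."""
    norms = np.linalg.norm(per_node_grads, axis=1, keepdims=True)
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_node_grads * factors
    dim = per_node_grads.shape[1]
    return clipped.sum(axis=0) + sml_noise(dim, scale, rng)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = poisson_sample(num_nodes=1000, rate=0.05, rng=rng)
    grads = rng.normal(size=(len(batch), 8))  # hypothetical per-node gradients
    print(noisy_clipped_sum(grads, clip_norm=1.0, scale=0.5, rng=rng))
```

Note that a single exponential variable is shared across all coordinates, so the noise coordinates are uncorrelated but not independent; this is what distinguishes SML noise from coordinate-wise Laplace noise.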