Learning with relational and network-structured data is increasingly vital in sensitive domains where protecting the privacy of individual entities is paramount. Differential Privacy (DP) offers a principled approach for quantifying privacy risks, with DP-SGD emerging as a standard mechanism for private model training. However, directly applying DP-SGD to relational learning is challenging due to two key factors: (i) entities often participate in multiple relations, resulting in high and difficult-to-control sensitivity; and (ii) relational learning typically involves multi-stage, potentially coupled (interdependent) sampling procedures that make standard privacy amplification analyses inapplicable. This work presents a principled framework for relational learning with formal entity-level DP guarantees. We provide a rigorous sensitivity analysis and introduce an adaptive gradient clipping scheme that modulates clipping thresholds based on entity occurrence frequency. We also extend privacy amplification results to a tractable subclass of coupled sampling, where the dependence arises only through sample sizes. These contributions lead to a tailored DP-SGD variant for relational data with provable privacy guarantees. Experiments on fine-tuning text encoders over text-attributed network-structured relational data demonstrate that our approach achieves strong privacy-utility trade-offs. Our code is available at https://github.com/Graph-COM/Node_DP.
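To make the abstract's key mechanism concrete, the sketch below illustrates one possible form of frequency-adaptive clipping in a DP-SGD aggregation step: each record's clipping threshold shrinks with the occurrence count of its underlying entity, so that the total contribution of any single entity stays bounded by a base threshold before Gaussian noise is added. This is an illustrative sketch under assumed design choices (the inverse-count schedule, the function name, and the noise calibration are hypothetical), not the paper's exact algorithm.

```python
import numpy as np

def adaptive_clip_dpsgd_step(per_record_grads, entity_counts,
                             base_clip=1.0, noise_mult=1.0, rng=None):
    """One noisy-gradient aggregation step with frequency-adaptive clipping.

    per_record_grads: array of shape (n_records, dim), one gradient per record.
    entity_counts:    occurrence count k_i of the entity behind record i.
                      Records from frequent entities get tighter thresholds,
                      keeping each entity's total contribution near base_clip.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = []
    for g, k in zip(per_record_grads, entity_counts):
        thresh = base_clip / k  # hypothetical inverse-frequency schedule
        norm = np.linalg.norm(g)
        # Standard norm clipping: rescale only if the norm exceeds the threshold
        clipped.append(g * min(1.0, thresh / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # Gaussian noise calibrated to the worst-case per-entity sensitivity (~base_clip)
    return total + rng.normal(0.0, noise_mult * base_clip, size=total.shape)
```

With `noise_mult=0`, a record whose gradient has norm 5 and whose entity appears twice is clipped to norm 0.5, while an infrequent entity's small gradient passes through unchanged; the privacy accounting itself (composition across steps, amplification under the coupled sampling analyzed in the paper) is outside this sketch.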