Differentially Private Federated Learning (DP-FL) has garnered attention as a collaborative machine learning approach that provides formal privacy guarantees. Most DP-FL approaches for cross-silo FL ensure DP at the record level within each silo. However, a single user's data may extend across multiple silos, and how to provide the desired user-level DP guarantee in such a setting remains unknown. In this study, we present Uldp-FL, a novel FL framework designed to guarantee user-level DP in cross-silo FL where a single user's data may belong to multiple silos. Our proposed algorithm directly ensures user-level DP through per-user weighted clipping, departing from group-privacy approaches. We provide a theoretical analysis of the algorithm's privacy and utility. Additionally, we improve the utility of the proposed algorithm with an enhanced weighting strategy based on the user record distribution, and we design a novel private protocol that ensures no additional information is revealed to the silos or the server. Experiments on real-world datasets show that our methods substantially improve the privacy-utility trade-off under user-level DP compared to baseline methods. To the best of our knowledge, our work is the first FL framework that effectively provides user-level DP in the general cross-silo FL setting.
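To make the idea of per-user weighted clipping concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each silo computes one model delta per user, clips that delta to an L2 norm of at most `clip_norm`, scales it by a per-user weight, and adds Gaussian noise to the sum. It further assumes that each user's weights sum to 1 across silos, so that a single user's total contribution across all silos has norm at most `clip_norm`. The record-proportional weights shown here are one plausible reading of a weighting strategy "based on user record distribution". All function and parameter names are hypothetical, and the private protocol for computing the weights is not modeled.

```python
import math
import random


def clip_to_norm(vec, clip_norm):
    """Scale vec down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def record_proportional_weights(record_counts):
    """Assumed weighting: a user's weight in a silo is that silo's share of
    the user's records, so each user's weights sum to 1 across silos.

    record_counts maps silo -> {user -> number of records}.
    """
    totals = {}
    for counts in record_counts.values():
        for user, n in counts.items():
            totals[user] = totals.get(user, 0) + n
    return {
        silo: {user: n / totals[user] for user, n in counts.items()}
        for silo, counts in record_counts.items()
    }


def silo_update(user_deltas, weights, clip_norm, noise_std, rng):
    """Sum the weighted, clipped per-user deltas of one silo and add noise.

    user_deltas maps user -> model delta (list of floats);
    weights maps user -> this silo's weight for that user.
    """
    dim = len(next(iter(user_deltas.values())))
    total = [0.0] * dim
    for user, delta in user_deltas.items():
        clipped = clip_to_norm(delta, clip_norm)
        for i in range(dim):
            total[i] += weights[user] * clipped[i]
    # Gaussian noise per coordinate; calibrating noise_std to a target
    # privacy budget is outside the scope of this sketch.
    return [t + rng.gauss(0.0, noise_std) for t in total]
```

Under these assumptions, the sensitivity of the aggregate to any one user is bounded by `clip_norm` regardless of how many silos hold that user's records, which is why no group-privacy conversion from record-level DP is needed.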