Federated learning (FL) emerged as a paradigm designed to improve data privacy by keeping data at its source, embedding privacy as a core consideration in FL architectures, whether centralized or decentralized. In contrast to recent findings by Pasquini et al., which suggest that decentralized FL empirically offers no additional privacy or security benefits over centralized models, our study provides compelling evidence to the contrary. We demonstrate that decentralized FL, when deploying distributed optimization, provides enhanced privacy protection, both theoretically and empirically, compared to centralized approaches. The challenge of quantifying privacy loss through iterative processes has long constrained the theoretical analysis of FL protocols. We overcome this by conducting a pioneering, in-depth information-theoretic privacy analysis of both frameworks. Our analysis, which considers both eavesdropping and passive-adversary models, establishes bounds on privacy leakage. We show, information-theoretically, that the privacy loss in decentralized FL is upper bounded by the loss in centralized FL. Compared with the centralized case, where the local gradients of individual participants are directly revealed, a key distinction of optimization-based decentralized FL is that the relevant information comprises differences of local gradients over successive iterations and the aggregated sum of different nodes' gradients over the network, which complicates the adversary's attempts to infer private data. To bridge our theoretical insights with practical applications, we present detailed case studies involving logistic regression and deep neural networks. These examples demonstrate that, while privacy leakage remains comparable in simpler models, complex models such as deep neural networks exhibit lower privacy risks under decentralized FL.
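The contrast in what an adversary observes under the two settings can be sketched with a toy NumPy example; this is a hedged illustration of the quantities named above (raw per-node gradients versus successive gradient differences and a network-wide aggregate), not the paper's protocol, and all array values and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 3 nodes, a 2-dimensional model, 2 iterations.
# g[t] holds every node's local gradient at iteration t (hypothetical values).
g = [rng.normal(size=(3, 2)) for _ in range(2)]

# Centralized FL: the server (or an eavesdropper on the uplink) observes
# each participant's raw local gradient directly.
centralized_view = g[1]

# Optimization-based decentralized FL: per the abstract, the relevant
# observable information is (a) differences of each node's local gradients
# over successive iterations and (b) the aggregated sum of the nodes'
# gradients over the network.
gradient_differences = g[1] - g[0]      # per-node successive differences
network_aggregate = g[1].sum(axis=0)    # sum over all nodes

# Recovering a raw gradient g[1][i] from these quantities alone requires
# the unknown previous gradient g[0][i] and disentangling the aggregate,
# which is what complicates the adversary's inference of private data.
print(centralized_view.shape, gradient_differences.shape, network_aggregate.shape)
```

The sketch only makes the information gap concrete: the centralized view exposes each node's gradient term by term, while the decentralized view exposes functions of those gradients from which individual gradients are not directly readable.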