Federated learning (FL) emerged as a paradigm designed to improve data privacy by keeping data at its source, embedding privacy as a core consideration in FL architectures, whether centralized or decentralized. In contrast to recent findings by Pasquini et al., which suggest that decentralized FL empirically offers no additional privacy or security benefits over centralized models, our study provides compelling evidence to the contrary. We demonstrate that decentralized FL, when deploying distributed optimization, provides enhanced privacy protection, both theoretically and empirically, compared to centralized approaches. The challenge of quantifying privacy loss through iterative processes has long constrained the theoretical analysis of FL protocols. We overcome this by conducting a pioneering, in-depth information-theoretic privacy analysis of both frameworks. Our analysis, which considers both eavesdropping and passive-adversary models, establishes bounds on privacy leakage. We show, information-theoretically, that the privacy loss in decentralized FL is upper bounded by the loss in centralized FL. Compared with the centralized case, where the local gradients of individual participants are directly revealed, a key distinction of optimization-based decentralized FL is that the relevant information comprises differences of local gradients over successive iterations and the aggregated sum of different nodes' gradients over the network, which complicates the adversary's attempts to infer private data. To bridge our theoretical insights with practical applications, we present detailed case studies involving logistic regression and deep neural networks. These examples demonstrate that, while privacy leakage remains comparable in simpler models, complex models such as deep neural networks exhibit lower privacy risks under decentralized FL.
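The contrast in what an adversary observes under the two settings can be sketched with a toy NumPy example; this is a hedged illustration of the quantities named above (raw per-node gradients versus successive gradient differences and a network-wide aggregate), not the paper's protocol, and all array values and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 3 nodes, a 2-dimensional model, 2 iterations.
# g[t] holds every node's local gradient at iteration t (hypothetical values).
g = [rng.normal(size=(3, 2)) for _ in range(2)]

# Centralized FL: the server (or an eavesdropper on the uplink) observes
# each participant's raw local gradient directly.
centralized_view = g[1]

# Optimization-based decentralized FL: per the abstract, the relevant
# observable information is (a) differences of each node's local gradients
# over successive iterations and (b) the aggregated sum of the nodes'
# gradients over the network.
gradient_differences = g[1] - g[0]      # per-node successive differences
network_aggregate = g[1].sum(axis=0)    # sum over all nodes

# Recovering a raw gradient g[1][i] from these quantities alone requires
# the unknown previous gradient g[0][i] and disentangling the aggregate,
# which is what complicates the adversary's inference of private data.
print(centralized_view.shape, gradient_differences.shape, network_aggregate.shape)
```

The sketch only makes the information gap concrete: the centralized view exposes each node's gradient term by term, while the decentralized view exposes functions of those gradients from which individual gradients are not directly readable.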