Towards Achieving Near-optimal Utility for Privacy-Preserving Federated Learning via Data Generation and Parameter Distortion

Federated learning (FL) enables participating parties to collaboratively build a global model with boosted utility without disclosing private data information. Appropriate protection mechanisms have to be adopted to fulfill the requirements in preserving \textit{privacy} and maintaining high model \textit{utility}. The nature of the widely-adopted protection mechanisms including \textit{Randomization Mechanism} and \textit{Compression Mechanism} is to protect privacy via distorting model parameter. We measure the utility via the gap between the original model parameter and the distorted model parameter. We want to identify under what general conditions privacy-preserving federated learning can achieve near-optimal utility via data generation and parameter distortion. To provide an avenue for achieving near-optimal utility, we present an upper bound for utility loss, which is measured using two main terms called variance-reduction and model parameter discrepancy separately. Our analysis inspires the design of appropriate protection parameters for the protection mechanisms to achieve near-optimal utility and meet the privacy requirements simultaneously. The main techniques for the protection mechanism include parameter distortion and data generation, which are generic and can be applied extensively. Furthermore, we provide an upper bound for the trade-off between privacy and utility, \blue{which together with the lower bound provided by no free lunch theorem in federated learning (\cite{zhang2022no}) form the conditions for achieving optimal trade-off.

翻译：联邦学习（FL）使得参与方能够在不泄露私有数据信息的情况下，协作构建一个具有提升效用的全局模型。必须采用适当的保护机制来满足保护隐私和保持高模型效用的要求。广泛采用的保护机制包括随机化机制和压缩机制，其本质是通过扰动模型参数来保护隐私。我们通过原始模型参数与扰动后模型参数之间的差距来衡量效用。我们旨在探究在何种通用条件下，隐私保护联邦学习能够通过数据生成和参数扰动实现近最优效用。为提供实现近最优效用的途径，我们给出了效用损失的上界，该上界分别用方差缩减和模型参数差异这两个主要项来衡量。我们的分析启发设计了合适的保护参数，使保护机制在满足隐私需求的同时实现近最优效用。保护机制的主要技术包括参数扰动和数据生成，这些技术具有通用性且可广泛适用。此外，我们提供了隐私与效用权衡的上界，该上界与联邦学习中无免费午餐定理提供的下界共同构成了实现最优权衡的条件。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日