Federated learning (FL) enables participating parties to collaboratively build a global model with improved utility without disclosing their private data. Appropriate protection mechanisms must be adopted to both preserve \textit{privacy} and maintain high model \textit{utility}. Widely adopted protection mechanisms, including the \textit{Randomization Mechanism} and the \textit{Compression Mechanism}, protect privacy by distorting the model parameters. We measure utility via the gap between the original model parameters and the distorted model parameters. We aim to identify the general conditions under which privacy-preserving federated learning can achieve near-optimal utility via data generation and parameter distortion. As a route to near-optimal utility, we present an upper bound on the utility loss, expressed in terms of two main quantities: variance reduction and model parameter discrepancy. Our analysis informs the choice of protection parameters so that the protection mechanisms achieve near-optimal utility while meeting the privacy requirements. The main techniques underlying the protection mechanisms are parameter distortion and data generation, which are generic and widely applicable. Furthermore, we provide an upper bound on the trade-off between privacy and utility, which, together with the lower bound given by the No-Free-Lunch (NFL) theorem, forms the conditions for achieving the optimal trade-off.
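As a rough illustration of measuring utility through the gap between original and distorted parameters, the sketch below distorts a small parameter vector in two ways and reports the resulting gap. Everything here is an assumption made for illustration: the abstract does not specify the mechanisms' exact form or the distance used, so Gaussian noise stands in for a randomization mechanism, top-k sparsification stands in for a compression mechanism, and the Euclidean norm stands in for the gap. The names `randomize`, `compress`, and `parameter_gap` are hypothetical.

```python
import math
import random


def randomize(params, sigma, rng):
    """Assumed randomization-style distortion: add zero-mean Gaussian noise."""
    return [w + rng.gauss(0.0, sigma) for w in params]


def compress(params, keep):
    """Assumed compression-style distortion: keep the `keep` largest-magnitude
    entries and set the rest to zero."""
    if keep <= 0:
        return [0.0] * len(params)
    ranked = sorted(range(len(params)), key=lambda i: abs(params[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(params)]


def parameter_gap(original, distorted):
    """Euclidean distance between the original and distorted parameters
    (one possible choice of gap, not necessarily the paper's)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, distorted)))


rng = random.Random(0)  # fixed seed so runs are repeatable
original = [0.8, -0.1, 0.05, 1.2, -0.6]

# Larger noise scales typically give a larger gap.
for sigma in (0.0, 0.01, 0.1):
    print("sigma", sigma, "gap", parameter_gap(original, randomize(original, sigma, rng)))

# Keeping fewer entries discards more of the original vector.
for keep in (5, 2):
    print("keep", keep, "gap", parameter_gap(original, compress(original, keep)))
```

In this toy setting the protection parameters are `sigma` and `keep`: no distortion (`sigma = 0.0`, or keeping every entry) gives a gap of zero, and stronger distortion generally increases it, which is the tension between privacy and utility that the abstract describes.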