Federated learning (FL) enables multiple clients to collaboratively learn a shared model without sharing their individual data. Utility, privacy, and training efficiency in FL have all attracted significant research attention. Differential privacy has become a prevalent technique in FL: it safeguards the privacy of individual user data, but at a cost to utility and training efficiency. Within differentially private federated learning (DPFL), previous studies have primarily focused on the utility-privacy trade-off and have neglected training efficiency, which is crucial for completing training on time. Moreover, differential privacy achieves privacy by introducing controlled randomness (noise) on the clients selected in each communication round. Previous work has mainly examined how the noise level ($\sigma$) and the number of communication rounds ($T$) affect the privacy-utility trade-off, overlooking other influential factors such as the sample ratio ($q$, the proportion of clients selected per round). This paper systematically formulates an efficiency-constrained utility-privacy bi-objective optimization problem in DPFL over $\sigma$, $T$, and $q$. We provide a comprehensive theoretical analysis that yields analytical solutions for the Pareto front. Extensive experiments verify the validity and efficacy of our analysis, offering guidance for low-cost parameter design in DPFL.
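To make the roles of $\sigma$, $T$, and $q$ concrete, the following is a minimal sketch of a generic DPFL training loop, not the algorithm analyzed in this paper. It assumes a linear regression model, independent per-client sampling with probability $q$, norm clipping of each client update, and Gaussian noise with standard deviation $\sigma$ times the clipping bound added on each selected client. The names `dp_fl_round`, `train`, `clip_norm`, and `lr` are hypothetical and chosen only for illustration.

```python
import numpy as np


def dp_fl_round(w, client_data, q, sigma, clip_norm, lr, rng):
    """Run one communication round and return the updated global model.

    Assumptions (illustrative only): linear model with squared loss,
    one local gradient step per client, client-side Gaussian noise.
    """
    # Each client participates independently with probability q (sample ratio).
    selected = [k for k in range(len(client_data)) if rng.random() < q]
    if not selected:
        return w  # nobody was sampled this round; model is unchanged

    noisy_updates = []
    for k in selected:
        X, y = client_data[k]
        # Gradient of the mean squared error of the linear model X @ w.
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        update = -lr * grad
        # Clip the update so its L2 norm is at most clip_norm.
        norm = np.linalg.norm(update)
        update = update * min(1.0, clip_norm / (norm + 1e-12))
        # Add Gaussian noise scaled by the noise level sigma.
        update = update + rng.normal(0.0, sigma * clip_norm, size=update.shape)
        noisy_updates.append(update)

    # The server averages the noisy updates of the selected clients.
    return w + np.mean(noisy_updates, axis=0)


def train(client_data, dim, sigma, T, q, clip_norm=1.0, lr=0.1, seed=0):
    """Run T communication rounds starting from a zero model."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    for _ in range(T):
        w = dp_fl_round(w, client_data, q, sigma, clip_norm, lr, rng)
    return w
```

In this sketch, a larger $\sigma$ adds more noise per round, a larger $T$ runs more noisy rounds, and a larger $q$ involves more clients per round, so all three affect both the accuracy of the returned model and the amount of computation and communication spent. How these quantities map to a formal privacy budget is not modeled here.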