Federated learning (FL) enables multiple clients to collaboratively train a shared model without sharing their individual data. Utility, privacy, and training efficiency in FL have all attracted significant research attention. Differential privacy has emerged as a prevalent technique in FL: it safeguards the privacy of individual user data, but it also affects utility and training efficiency. Within Differential Privacy Federated Learning (DPFL), previous studies have primarily focused on the utility-privacy trade-off and neglected training efficiency, which is crucial for completing training in a timely manner. Moreover, differential privacy achieves privacy by introducing controlled randomness (noise) on the clients selected in each communication round. Previous work has mainly examined the impact of the noise level ($\sigma$) and the number of communication rounds ($T$) on the privacy-utility trade-off, overlooking other influential factors such as the sample ratio ($q$, the proportion of clients selected in each round). This paper systematically formulates an efficiency-constrained utility-privacy bi-objective optimization problem in DPFL over $\sigma$, $T$, and $q$. We provide a comprehensive theoretical analysis that yields analytical solutions for the Pareto front. Extensive experiments verify the validity and efficacy of our analysis, offering guidance for low-cost parameter design in DPFL.
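To make the roles of $\sigma$, $T$, and $q$ concrete, the following is a minimal toy sketch of a noisy federated averaging loop. It is not the algorithm analyzed in this paper: the function names (`clip`, `dp_fedavg`), the clipping threshold `clip_norm`, the step size `lr`, and the "move toward a client target" local update are all assumptions introduced for illustration, and the sketch performs no privacy accounting.

```python
import math
import random


def clip(update, clip_norm):
    """Scale an update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def dp_fedavg(client_targets, sigma, T, q, clip_norm=1.0, lr=0.5, seed=0):
    """Toy loop: T rounds, a fraction q of clients per round, noise level sigma.

    Each client is represented by a target vector (a hypothetical stand-in
    for local training): its update pulls the model toward that target.
    """
    rng = random.Random(seed)
    dim = len(client_targets[0])
    model = [0.0] * dim
    # Number of clients selected per round, derived from the sample ratio q.
    k = max(1, round(q * len(client_targets)))
    for _ in range(T):
        selected = rng.sample(client_targets, k)
        total = [0.0] * dim
        for target in selected:
            # Assumed local step, clipped to bound each client's contribution.
            update = clip([lr * (t - m) for t, m in zip(target, model)], clip_norm)
            # Gaussian noise with standard deviation sigma * clip_norm,
            # added on the selected client before aggregation.
            noisy = [u + rng.gauss(0.0, sigma * clip_norm) for u in update]
            total = [a + b for a, b in zip(total, noisy)]
        # Server averages the noisy updates and applies them to the model.
        model = [m + s / k for m, s in zip(model, total)]
    return model


if __name__ == "__main__":
    targets = [[1.0, 0.0], [0.0, 1.0]]
    print(dp_fedavg(targets, sigma=0.0, T=10, q=1.0, clip_norm=10.0))
    print(dp_fedavg(targets, sigma=0.5, T=10, q=1.0, clip_norm=10.0))
```

In this toy setting, a larger `sigma` perturbs the final model more, while `T` and `q` determine how many noisy client updates are computed in total, which is one simple proxy for training cost.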