Federated learning (FL) is increasingly deployed across multiple clients (e.g., mobile devices) to train a shared model over decentralized data. To address privacy concerns, FL systems must protect clients' data from being revealed during training and also limit data leakage through trained models that are exposed to untrusted domains. Distributed differential privacy (DP) is an appealing solution because it achieves an informed tradeoff between privacy and utility without a trusted server. However, existing distributed DP mechanisms are impractical in the presence of client dropout, resulting in either poor privacy guarantees or degraded training accuracy. These mechanisms also suffer from severe efficiency issues, with long time-to-accuracy. We present Hyades, a distributed differentially private FL framework that is highly efficient and resilient to client dropout. Specifically, we develop a novel 'add-then-remove' scheme that enforces the required noise level in each FL training round even when some sampled clients drop out before the round completes; the privacy budget is therefore consumed carefully even in the presence of client dropout. To boost performance, Hyades runs as a distributed pipeline architecture that encapsulates communication and computation operations into stages. It automatically divides global model aggregation into several chunk-aggregation tasks and pipelines them for optimal speedup. Evaluation through large-scale cloud deployment shows that Hyades efficiently handles client dropout in various realistic FL scenarios, attaining the optimal privacy-utility tradeoff and accelerating training by up to 2.1$\times$ compared to existing solutions.
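The 'add-then-remove' idea can be illustrated with simple variance accounting. The sketch below is not taken from the abstract: it assumes additive, independent per-client noise whose variances sum, and one possible decomposition in which each client splits its noise into `t + 1` components so that excess components can be removed once the actual number of dropouts is known. The names `component_variances`, `aggregate_variance_after_removal`, `n`, `t`, and `target_var` are hypothetical.

```python
from fractions import Fraction


def component_variances(n, t, target_var):
    """Per-client variances of noise components 0..t (illustrative assumption).

    Components 0..k of one client together have variance target_var / (n - k),
    so n - k surviving clients contribute exactly target_var in aggregate.
    """
    if not 0 <= t < n:
        raise ValueError("need 0 <= t < n")
    target = Fraction(target_var)
    variances = [target / n]  # component 0: enough if nobody drops out
    for k in range(1, t + 1):
        # Extra variance needed to tolerate the k-th dropout.
        variances.append(target / (n - k) - target / (n - k + 1))
    return variances


def aggregate_variance_after_removal(n, t, target_var, dropped):
    """Aggregate noise variance once excess components are removed.

    Every client adds all t + 1 components up front ("add"). After `dropped`
    clients are observed to be missing, components dropped+1..t of each
    survivor are subtracted from the aggregate ("remove").
    """
    if not 0 <= dropped <= t:
        raise ValueError("dropout exceeds the tolerated number t")
    kept = component_variances(n, t, target_var)[: dropped + 1]
    survivors = n - dropped
    return survivors * sum(kept)


if __name__ == "__main__":
    n, t, target_var = 10, 3, 4
    for dropped in range(t + 1):
        print(dropped, aggregate_variance_after_removal(n, t, target_var, dropped))
```

Under these assumptions, the aggregate noise variance equals the target for any number of dropouts up to `t`, which is the property the abstract attributes to the scheme: the required noise level is enforced regardless of how many sampled clients drop out. How the removed components are revealed securely is outside the scope of this sketch.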