Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other. However, it suffers from leakage of private information through the uploaded models. In addition, as the model size grows, the training latency increases due to limited transmission bandwidth, and the model performance degrades when differential privacy (DP) protection is applied. In this paper, we propose a gradient sparsification empowered FL framework over wireless channels, in order to improve training efficiency without sacrificing convergence performance. Specifically, we first design a random sparsification algorithm that retains a fraction of the gradient elements in each client's local training, thereby mitigating the performance degradation induced by DP and reducing the number of parameters transmitted over wireless channels. Then, we analyze the convergence bound of the proposed algorithm by modeling a non-convex FL problem. Next, we formulate a time-sequential stochastic optimization problem that minimizes the derived convergence bound under constraints on the transmit power, the average transmission delay, and each client's DP requirement. Utilizing the Lyapunov drift-plus-penalty framework, we develop an analytical solution to the optimization problem. Extensive experiments on three real-life datasets demonstrate the effectiveness of the proposed algorithm. We show that the proposed algorithm can fully exploit the interworking between communication and computation to outperform the baselines, namely the random scheduling, round robin, and delay-minimization algorithms.
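To make the random sparsification step concrete, the following is a minimal sketch of one plausible reading of it: a client keeps a uniformly random subset of gradient coordinates, clips the retained vector, and perturbs only those coordinates with Gaussian noise before upload. The abstract does not specify the clipping rule, the noise distribution, or how the retained fraction is chosen, so the function `sparsify_clip_noise` and the parameters `keep_ratio`, `clip_norm`, and `noise_std` are illustrative assumptions rather than the paper's notation or method.

```python
import random


def sparsify_clip_noise(grad, keep_ratio, clip_norm, noise_std, rng):
    """Randomly keep a fraction of gradient entries, clip them, add Gaussian noise.

    Hypothetical helper: names and the exact order of operations are
    assumptions made for illustration, not taken from the paper.
    """
    d = len(grad)
    k = max(1, round(keep_ratio * d))        # number of retained coordinates
    kept = sorted(rng.sample(range(d), k))   # uniformly random index set
    values = [grad[i] for i in kept]

    # Clip the sparse vector so its L2 norm is at most clip_norm.
    norm = sum(v * v for v in values) ** 0.5
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [v * scale for v in values]

    # Noise is added only on the k retained coordinates (assumed Gaussian).
    noisy = [v + rng.gauss(0.0, noise_std) for v in clipped]
    return kept, noisy


rng = random.Random(0)
grad = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
indices, payload = sparsify_clip_noise(
    grad, keep_ratio=0.1, clip_norm=1.0, noise_std=0.01, rng=rng
)
print(len(indices), len(payload))
```

Under these assumptions the client uploads `k` index-value pairs instead of all `d` parameters, which is where the transmission saving would come from; how the retained fraction interacts with the DP noise level and the convergence bound is the subject of the paper's analysis and is not reproduced here.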