Federated Learning (FL), a privacy-oriented distributed machine learning paradigm, is gaining great interest in the Internet of Things because it protects participants' data privacy. Prior studies have addressed challenges in standard FL, including communication efficiency and privacy preservation, but they do not achieve a tradeoff between communication efficiency and model accuracy while also guaranteeing privacy. This paper proposes a Conditional Random Sampling (CRS) method and integrates it into the standard FL setting (CRS-FL) to tackle these challenges. CRS uses a stochastic coefficient based on Poisson sampling to raise the probability of obtaining zero gradients while keeping the gradient estimate unbiased, which effectively reduces communication overhead without degrading model accuracy. Moreover, we theoretically derive the conditions under which CRS satisfies a relaxed Local Differential Privacy (LDP) guarantee. Extensive experimental results indicate that (1) in communication efficiency, CRS-FL outperforms existing methods in accuracy per transmitted byte, with no reduction in model accuracy when the sampling ratio (sampling size / model size) exceeds 7%; (2) in privacy preservation, CRS-FL shows no accuracy reduction compared with LDP baselines while retaining its communication efficiency, and even exceeds them in model accuracy under higher sampling ratios.
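To make the idea of unbiased Poisson sampling concrete, the following is a minimal sketch of generic gradient sparsification, not the CRS coefficient itself, whose exact form is not given in the abstract. The function name `poisson_sparsify`, the `budget` parameter, and the magnitude-proportional keep probabilities are all assumptions made for illustration. Each coordinate is kept independently with probability p_i and rescaled by 1/p_i, so the expected value of the sparse vector equals the original gradient while most entries become zero and need not be transmitted.

```python
import random


def poisson_sparsify(grad, budget, rng):
    """Sparsify a gradient with independent (Poisson) coordinate sampling.

    Hypothetical helper, not the paper's CRS: coordinate i is kept with
    probability p_i = min(1, budget * |g_i| / sum_j |g_j|) and rescaled by
    1 / p_i, which keeps the estimate unbiased. The expected number of
    non-zero entries is at most `budget`.
    """
    total = sum(abs(g) for g in grad)
    if total == 0.0:
        # An all-zero gradient stays all-zero.
        return [0.0] * len(grad)
    sparse = []
    for g in grad:
        p = min(1.0, budget * abs(g) / total)
        if p > 0.0 and rng.random() < p:
            sparse.append(g / p)  # rescale so that E[entry] == g
        else:
            sparse.append(0.0)    # dropped: nothing to transmit
    return sparse


# Usage: a toy 6-parameter "model" with an expected budget of 2 coordinates,
# i.e. a sampling ratio (sampling size / model size) of roughly 2 / 6.
rng = random.Random(0)
grad = [0.5, -1.0, 0.0, 2.0, 0.25, -0.25]
sparse = poisson_sparsify(grad, budget=2, rng=rng)
kept = sum(1 for v in sparse if v != 0.0)
print(f"kept {kept} of {len(grad)} coordinates:", sparse)
```

Averaging the output over many independent draws recovers the original gradient, which is the sense in which the sampling is unbiased; how CRS shapes the keep probabilities and how the LDP conditions constrain them are specified in the paper, not in this sketch.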