Federated Learning (FL), a privacy-oriented distributed ML paradigm, is gaining great interest in the Internet of Things because of its capability to protect participants' data privacy. Studies have been conducted to address challenges in standard FL, including communication efficiency and privacy preservation, but they cannot achieve a tradeoff between communication efficiency and model accuracy while guaranteeing privacy. This paper proposes a Conditional Random Sampling (CRS) method and implements it in the standard FL setting (CRS-FL) to tackle these challenges. CRS uses a stochastic coefficient based on Poisson sampling to obtain zero gradients with higher probability while remaining unbiased, and thereby reduces the communication overhead effectively without degrading model accuracy. Moreover, we theoretically derive the conditions under which CRS satisfies a relaxed Local Differential Privacy (LDP) guarantee. Extensive experimental results indicate that (1) in communication efficiency, CRS-FL outperforms existing methods on the accuracy-per-transmitted-byte metric, without reducing model accuracy, when the sampling ratio (# sampling size / # model size) exceeds 7%; (2) in privacy preservation, CRS-FL shows no accuracy reduction compared with LDP baselines while retaining its efficiency, and even exceeds them in model accuracy under more sampling-ratio conditions.
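To make the idea of unbiased Poisson sampling of a gradient concrete, the following is a minimal sketch and not the CRS algorithm itself: the abstract does not specify the stochastic coefficient, so this example assumes a generic scheme in which each coordinate is kept independently with a probability proportional to its magnitude and rescaled by the inverse of that probability. The function name `poisson_sparsify`, the parameter `ratio`, and the choice of inclusion probabilities are all hypothetical and chosen for illustration only.

```python
import numpy as np


def poisson_sparsify(grad, ratio, rng):
    """Sparsify a gradient by Poisson sampling (illustrative, not CRS).

    Each coordinate i is kept independently with probability p_i and,
    if kept, rescaled to grad_i / p_i, so the expectation of the output
    equals grad. Dropped coordinates become exact zeros and need not be
    transmitted.
    """
    grad = np.asarray(grad, dtype=float)
    mag = np.abs(grad)
    total = mag.sum()
    if total == 0.0:
        return np.zeros_like(grad)

    # Assumed inclusion probabilities: proportional to magnitude, capped
    # at 1. Before capping they sum to ratio * grad.size, so `ratio`
    # plays the role of a target sampling ratio.
    p = np.minimum(1.0, ratio * grad.size * mag / total)

    # Independent Bernoulli draw per coordinate (Poisson sampling).
    keep = rng.random(grad.shape) < p

    out = np.zeros_like(grad)
    out[keep] = grad[keep] / p[keep]  # inverse-probability rescaling
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = np.array([1.0, -2.0, 0.5, 0.0, 4.0])
    print(poisson_sparsify(g, ratio=0.4, rng=rng))
```

Averaging many such sparsified gradients recovers the original gradient in expectation, which is the property that lets a sparsifier of this kind cut transmitted bytes without biasing the aggregated update. How CRS conditions the sampling, and how it meets the relaxed LDP guarantee, is defined in the paper and is not reproduced here.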