Federated learning (FL) combined with differential privacy (DP) enables machine learning (ML) training across distributed devices with a formal privacy guarantee. With a large population of devices, FL with DP produces a performant model in a timely manner. For applications with a smaller population, however, model utility degrades because the DP noise is inversely proportional to the population size, and training latency increases because waiting for enough clients to become available from a smaller pool takes longer. In this work, we therefore propose expanding the population using domain adaptation techniques to speed up training and improve final model quality when training with small populations. We empirically demonstrate that our techniques improve utility by 13% to 30% on real-world language modeling datasets.
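To make the dependence of DP noise on population size concrete, the following is a minimal sketch of one noisy aggregation round. It assumes a Gaussian mechanism applied to the sum of norm-clipped client updates, as in common DP-FedAvg-style setups; this is an illustrative assumption, not necessarily the exact mechanism used in this work. The names `dp_average`, `effective_noise_std`, `clip_norm`, and `noise_multiplier` are hypothetical.

```python
import numpy as np


def dp_average(updates, clip_norm, noise_multiplier, rng):
    """Average client updates with clipping and Gaussian noise (assumed mechanism)."""
    n = len(updates)
    clipped = []
    for u in updates:
        norm = np.linalg.norm(u)
        # Scale the update down so its L2 norm is at most clip_norm.
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        clipped.append(u * scale)
    total = np.sum(clipped, axis=0)
    # Noise is calibrated to the per-client sensitivity (clip_norm), not to n.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    # Dividing by n shrinks the noise in the averaged update by a factor of n.
    return (total + noise) / n


def effective_noise_std(num_clients, clip_norm, noise_multiplier):
    """Per-coordinate noise std in the averaged update under the assumptions above."""
    return noise_multiplier * clip_norm / num_clients


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, effective_noise_std(n, clip_norm=1.0, noise_multiplier=1.0))
```

Under these assumptions, a tenfold smaller cohort sees tenfold larger noise in the averaged update for the same noise multiplier, which is the utility degradation that population expansion aims to counter.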