This work presents a new method for improving communication efficiency in stochastic Federated Learning that trains over-parameterized random networks. In this setting, a binary mask is optimized instead of the model weights, which are kept fixed. The mask characterizes a sparse sub-network that is able to generalize as well as a smaller target network. Importantly, sparse binary masks are exchanged rather than the floating-point weights of traditional federated learning, reducing the communication cost to at most 1 bit per parameter (Bpp). We show that previous state-of-the-art stochastic methods fail to find sparse networks that can reduce the communication and storage overhead using consistent loss objectives. To address this, we propose adding a regularization term to the local objectives that acts as a proxy for the entropy of the transmitted masks, thereby encouraging sparser solutions by eliminating redundant features across sub-networks. Extensive empirical experiments demonstrate significant improvements in communication and memory efficiency of up to five orders of magnitude compared to the literature, with minimal degradation in validation accuracy in some instances.
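To illustrate why sparser masks lower the communication cost below the 1 Bpp ceiling, the following minimal sketch estimates the bits per parameter of a binary mask under an ideal entropy coder. It is an illustration under stated assumptions, not the regularizer proposed in this work: the helper names (`binary_entropy`, `sample_mask`, `estimated_bpp`) are hypothetical, mask entries are assumed to be independent Bernoulli draws, and the cost is approximated by the binary entropy of the mask density.

```python
import math
import random


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) variable (0 when p is 0 or 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def sample_mask(probs, rng):
    """Draw one binary mask, keeping weight i with probability probs[i]."""
    return [1 if rng.random() < p else 0 for p in probs]


def estimated_bpp(mask):
    """Approximate bits per parameter of a mask under an ideal entropy coder.

    Assumption: entries are i.i.d., so the cost per entry is the binary
    entropy of the fraction of ones in the mask.
    """
    density = sum(mask) / len(mask)
    return binary_entropy(density)


rng = random.Random(0)   # fixed seed, for reproducibility only
num_params = 10_000      # illustrative size, not taken from the paper

dense_mask = sample_mask([0.5] * num_params, rng)     # about half the weights kept
sparse_mask = sample_mask([0.05] * num_params, rng)   # about 5% of the weights kept

print("dense mask Bpp: ", estimated_bpp(dense_mask))
print("sparse mask Bpp:", estimated_bpp(sparse_mask))
```

A mask with density near 0.5 sits at the 1 Bpp upper bound, whereas a mask with only a few active entries has a much smaller entropy. This is the intuition behind penalizing a proxy of the mask entropy in the local objectives.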