This work presents a new method for improving communication efficiency in stochastic Federated Learning that trains over-parameterized random networks. In this setting, a binary mask is optimized instead of the model weights, which are kept fixed. The mask characterizes a sparse sub-network that can generalize as well as a smaller target network. Importantly, sparse binary masks are exchanged rather than the floating-point weights used in traditional federated learning, reducing the communication cost to at most 1 bit per parameter. We show that previous state-of-the-art stochastic methods fail to find sparse networks that can reduce the communication and storage overhead using consistent loss objectives. To address this, we propose adding a regularization term to the local objectives that encourages sparser solutions by eliminating redundant features across sub-networks. Extensive experiments demonstrate significant improvements in communication and memory efficiency of up to five orders of magnitude compared to the literature, with minimal degradation in validation accuracy in some instances.
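To make the setting above concrete, the following is a minimal sketch, not the authors' implementation. It assumes each client holds a vector of keep probabilities (`probs`, a hypothetical name), samples a binary mask from it, applies the mask to fixed random weights, and uploads only the mask. The `sparsity_penalty` function is a simple stand-in that penalizes the expected mask density; the regularization term actually proposed in this work targets redundant features across sub-networks and is not reproduced here. The entropy-based estimate of the coded size assumes i.i.d. mask entries, which is also an assumption of this sketch.

```python
import math
import random


def sample_mask(probs, rng):
    """Draw a binary mask: entry i is 1 with probability probs[i]."""
    return [1 if rng.random() < p else 0 for p in probs]


def masked_weights(weights, mask):
    """Apply the mask to fixed random weights; the weights are never updated."""
    return [w * m for w, m in zip(weights, mask)]


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def uplink_cost_bits(mask):
    """Return (raw_bits, coded_bits) for sending one mask.

    raw_bits is 1 bit per parameter. coded_bits is the size an ideal
    entropy coder would approach, assuming i.i.d. mask entries
    (an assumption of this sketch).
    """
    n = len(mask)
    density = sum(mask) / n
    return n, n * binary_entropy(density)


def sparsity_penalty(probs, lam):
    """Hypothetical regularizer: lam times the expected mask density.

    This is an illustrative stand-in, not the term proposed in the paper.
    """
    return lam * sum(probs) / len(probs)


rng = random.Random(0)
n = 10_000
weights = [rng.gauss(0.0, 1.0) for _ in range(n)]  # fixed random initialization
probs = [0.05] * n                                  # hypothetical learned keep probabilities
mask = sample_mask(probs, rng)
sub_network = masked_weights(weights, mask)

raw_bits, coded_bits = uplink_cost_bits(mask)
float_bits = 32 * n  # cost of sending float32 weights instead of a mask

print(f"float32 weights: {float_bits} bits")
print(f"raw binary mask: {raw_bits} bits")
print(f"entropy-coded mask (ideal): {coded_bits:.0f} bits")
print(f"penalty added to local loss: {sparsity_penalty(probs, lam=0.1):.4f}")
```

The sketch illustrates why sparsity matters beyond the 1-bit-per-parameter bound: the sparser the mask, the lower its entropy, so a compressed mask can cost well under 1 bit per parameter.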