Federated Learning (FL) is commonly used in systems of distributed, heterogeneous devices that have access to varying amounts of data and differ in computing and storage capacity. The FL training process lets such devices update the weights of a shared model locally using their own data, after which a trusted central server combines the locally trained models into a global model. In this way, a global model is produced while the data remains on the devices, preserving privacy. However, training large models such as Deep Neural Networks (DNNs) on resource-constrained devices can take prohibitively long and consume a large amount of energy. In the current process, low-capacity devices are excluded from training, although they might have access to unseen data. To overcome this challenge, we propose a model compression approach that enables heterogeneous devices with varying computing capacities to participate in the FL process. In our approach, the server first shares a dense model with all devices for training. Afterwards, the trained model is gradually compressed to obtain submodels with varying levels of sparsity, which serve as suitable initial global models for resource-constrained devices that were not capable of training the first dense model. This increases the participation rate of resource-constrained devices while preserving the weights transferred from the previous round of training. Our validation experiments show that, despite reaching about 50 per cent global sparsity, the generated submodels maintain their accuracy and can be shared to increase participation by around 50 per cent.
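The gradual compression step described above can be illustrated with a minimal sketch. The abstract does not state which compression criterion is used, so this example assumes global magnitude pruning (zeroing the smallest-magnitude weights across all layers). The names `prune_to_sparsity` and `gradual_submodels`, and the plain-dictionary weight format, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical weight format: layer name -> flat list of weights.
Weights = Dict[str, List[float]]


def global_sparsity(weights: Weights) -> float:
    """Return the fraction of all weights that are exactly zero."""
    total = sum(len(w) for w in weights.values())
    zeros = sum(1 for w in weights.values() for x in w if x == 0.0)
    return zeros / total if total else 0.0


def prune_to_sparsity(weights: Weights, target: float) -> Weights:
    """Zero the smallest-magnitude weights across all layers.

    Assumption: global magnitude pruning stands in for the paper's
    (unspecified) compression method. The input is not modified.
    """
    total = sum(len(w) for w in weights.values())
    n_zero = int(round(target * total))
    # Rank every weight globally by magnitude; already-zero weights sort first,
    # so pruning an already-pruned model only removes additional weights.
    ranked = sorted(
        (abs(x), name, i) for name, w in weights.items() for i, x in enumerate(w)
    )
    pruned = {name: list(w) for name, w in weights.items()}
    for _, name, i in ranked[:n_zero]:
        pruned[name][i] = 0.0
    return pruned


def gradual_submodels(dense: Weights, levels: List[float]) -> Dict[float, Weights]:
    """Compress step by step: each submodel is pruned from the previous one."""
    submodels: Dict[float, Weights] = {}
    current = dense
    for level in sorted(levels):
        current = prune_to_sparsity(current, level)
        submodels[level] = current
    return submodels
```

Under these assumptions, a server could call `gradual_submodels(trained_weights, [0.25, 0.5])` and send the sparser submodel to lower-capacity devices. Because each level is pruned from the previous one, the surviving weights in every submodel are exactly the trained values from the dense model, which mirrors the claim that transferred weights are preserved.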