Efficiently running federated learning (FL) on resource-constrained devices is challenging because each device must independently train a computationally intensive deep neural network (DNN). DNN partitioning-based FL (DPFL) has been proposed as one mechanism to accelerate training by offloading layers of the DNN (that is, computation) from the device to an edge server. However, this creates significant communication overhead, since activations and gradients must be transferred between the device and the edge server during training. Current techniques reduce the communication introduced by DNN partitioning using local loss-based methods. We demonstrate that these methods adversely impact accuracy and ignore the communication cost incurred when transmitting activations from the device to the server. This paper proposes ActionFed, a communication-efficient framework for DPFL that accelerates training on resource-constrained devices. ActionFed eliminates the transmission of the gradient by introducing, for the first time, pre-trained initialization of the DNN model on the device. This reduces the accuracy degradation seen in local loss-based methods. In addition, ActionFed proposes a novel replay buffer mechanism and implements a quantization-based compression technique to reduce the transmission of the activation. Experiments demonstrate that ActionFed can reduce the communication cost by up to 15.77x and accelerate training by up to 3.87x compared to vanilla DPFL.
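To make the idea of quantization-based activation compression concrete, the following is a minimal sketch of generic uniform 8-bit quantization applied to a flat activation vector before it is sent from the device to the server. It is not taken from ActionFed: the function names `quantize_uint8` and `dequantize_uint8`, the choice of 8 bits, and the per-vector min/max scaling are all assumptions made for illustration, and the actual scheme used by the framework may differ.

```python
from typing import List, Tuple


def quantize_uint8(activation: List[float]) -> Tuple[bytes, float, float]:
    """Uniform 8-bit quantization of a flat activation vector (illustrative only)."""
    lo, hi = min(activation), max(activation)
    # Fall back to a scale of 1.0 for a constant vector, whose range is zero.
    scale = (hi - lo) / 255.0 or 1.0
    # Map each value to an integer code in [0, 255]; clamp against rounding drift.
    codes = bytes(min(255, max(0, round((x - lo) / scale))) for x in activation)
    return codes, scale, lo


def dequantize_uint8(codes: bytes, scale: float, lo: float) -> List[float]:
    """Server-side reconstruction of approximate float activations."""
    return [c * scale + lo for c in codes]


# Hypothetical activation values standing in for a device-side layer output.
activation = [-1.0, -0.25, 0.0, 0.5, 3.0]
codes, scale, lo = quantize_uint8(activation)
restored = dequantize_uint8(codes, scale, lo)

float32_bytes = 4 * len(activation)  # size if sent as 32-bit floats
payload_bytes = len(codes)           # size of the 8-bit codes (metadata excluded)
max_error = max(abs(a - b) for a, b in zip(activation, restored))
```

Under these assumptions the payload shrinks by a factor of four relative to 32-bit floats (ignoring the two floats of metadata, `scale` and `lo`), at the cost of a reconstruction error of at most about half a quantization step per value.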