As deep learning models grow larger, they become increasingly difficult to use in heterogeneous device environments. Their size makes deployment on low-power or resource-constrained devices difficult, leading to long inference times and high energy consumption. To address these challenges, we propose FlexTrain, a framework that accommodates the diverse storage and computational resources of different devices during the training phase. FlexTrain enables efficient deployment of deep learning models while respecting device constraints, minimizing communication costs, and integrating seamlessly with diverse devices. We demonstrate the effectiveness of FlexTrain on the CIFAR-100 dataset, where a single global model trained with FlexTrain can be readily deployed on heterogeneous devices, reducing training time and energy consumption. We also extend FlexTrain to the federated learning setting and show that our approach outperforms standard federated learning benchmarks on both the CIFAR-10 and CIFAR-100 datasets.