Federated learning (FL) in multi-device environments creates new opportunities to learn from a vast and diverse amount of private data. Although personal devices capture valuable data, their memory, computing, connectivity, and battery resources are often limited. Since deep neural networks (DNNs) are the typical machine learning models employed in FL, there is demand for integrating ubiquitous constrained devices into the training process of DNNs. In this paper, we develop an FL framework that incorporates on-device data selection on such constrained devices and allows partition-based training of a DNN through collaboration between the constrained and resourceful devices of the same client. Evaluations on five benchmark DNNs and six benchmark datasets across different modalities show that, on average, our framework achieves ~19% higher accuracy and ~58% lower latency than the baseline FL without our implemented strategies. We also demonstrate the effectiveness of our FL framework when dealing with imbalanced data, client participation heterogeneity, and various mobility patterns. As a benchmark for the community, our code is available at https://github.com/dr-bell/data-centric-federated-learning