Federated Machine Learning (FL) has received considerable attention in recent years. Existing FL benchmarks predominantly target either simulated systems or data center environments, neglecting the setups of real-world systems, which are often closely linked to edge computing. We close this research gap by introducing FLEdge, a benchmark targeting FL workloads in edge computing systems. We systematically study hardware heterogeneity, energy efficiency during training, and the effect of various differential privacy levels on training in FL systems. To make this benchmark applicable to real-world scenarios, we evaluate the impact of client dropouts on state-of-the-art FL strategies with failure rates as high as 50%. FLEdge provides new insights; for example, training state-of-the-art FL workloads on older GPU-accelerated embedded devices is up to 3x more energy efficient than on modern server-grade GPUs.