Deploying Federated Learning (FL) on IoT devices is an area poised to benefit significantly from advances in NextG wireless. In this paper, we deploy an FL application on a 5G-NR Standalone (SA) testbed built from open-source and Commercial Off-the-Shelf (COTS) components. The 5G testbed architecture consists of a network of resource-constrained edge devices, namely Raspberry Pis, and a central server equipped with a Software Defined Radio (SDR) and running O-RAN software. Our testbed allows edge devices to communicate with the server using WiFi and Ethernet in addition to 5G. FL is deployed using the Flower FL framework, extended with custom instrumentation for communication and ML metrics. We analyze the FL application across three network interfaces (5G, WiFi, and Ethernet) as well as across 5G bandwidths and uplink-downlink scheduling ratios. Our experimental results challenge some common assumptions about communication time in FL over wireless, and we discuss the potential pitfalls of these assumptions. We find that there is a consistent straggler in about 70% of trials, while in the other 30%, high communication time causes competing stragglers. We also compare FL performance over 5G with and without external congestion, and compare our testbed to commercial 5G to validate our findings in a broader context. For reproducibility, we have open-sourced our FL application, instrumentation tools, and testbed configuration.