Federated Learning (FL) is a promising machine learning approach for the Internet of Things (IoT), but it must contend with network congestion as the number of IoT devices grows. Hierarchical FL (HFL) alleviates this issue by distributing model aggregation across multiple edge servers. Nevertheless, communication overhead remains a challenge, especially when all IoT devices join the training process simultaneously. For scalability, practical HFL schemes select only a subset of IoT devices to participate in training, which gives rise to the notion of device scheduling. In this setting, only the selected IoT devices are scheduled to participate in global training, and each of them is assigned to one edge server. Existing HFL assignment methods are primarily based on search mechanisms, which incur high latency in finding the optimal assignment. This paper proposes an improved K-Center algorithm for device scheduling and introduces a deep reinforcement learning-based approach for assigning IoT devices to edge servers. Experiments show that scheduling 50% of IoT devices is generally adequate for achieving convergence in HFL, with much lower time delay and energy consumption than scheduling all devices. Where reducing energy consumption (as in Green AI) and reducing the number of messages (to avoid burst traffic) are key objectives, scheduling 30% of IoT devices allows a substantial reduction in energy and messages while achieving similar model accuracy.
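To make the device-scheduling idea concrete, the following is a minimal sketch of the classic greedy K-Center heuristic (farthest-first traversal) used to pick a fraction of devices. It is not the improved K-Center algorithm proposed in the paper, whose details are not given here. The function names `greedy_k_center` and `schedule_devices`, the choice of Euclidean distance, and the assumption that each device is described by a numeric feature vector (for example, a summary of its local data) are all hypothetical and chosen for illustration only.

```python
import math


def greedy_k_center(points, k, first=0):
    """Classic farthest-first traversal (not the paper's improved variant).

    Returns the indices of k points chosen so that each new center is the
    point farthest from all centers selected so far.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [first]
    # dist[i] holds the distance from point i to its nearest selected center.
    dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        # Pick the point that is currently worst covered.
        nxt = max(range(len(points)), key=dist.__getitem__)
        selected.append(nxt)
        # Update nearest-center distances with the new center.
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return selected


def schedule_devices(features, fraction):
    """Select roughly `fraction` of the devices (hypothetical helper)."""
    k = max(1, round(fraction * len(features)))
    return greedy_k_center(features, k)


if __name__ == "__main__":
    # Four hypothetical devices forming two well-separated groups.
    features = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(schedule_devices(features, 0.5))
```

With a scheduling fraction of 0.5, the sketch picks one device from each of the two groups in the toy data, which illustrates why a K-Center style selection tends to keep the scheduled subset representative of the full device population.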