Hierarchical Federated Learning (HFL) extends conventional Federated Learning (FL) with intermediate aggregation layers, enabling distributed learning across geographically dispersed environments. This is particularly relevant for smart IoT systems, such as remote monitoring and battlefield operations, where cellular connectivity is limited. In these scenarios, UAVs serve as mobile aggregators that dynamically connect terrestrial IoT devices. This paper investigates an HFL architecture in which the UAVs are energy-constrained, dynamically deployed, and prone to communication disruptions. We propose a novel approach that minimizes the global training cost by formulating a joint optimization problem over learning configuration, bandwidth allocation, and device-to-UAV association, while ensuring that global aggregation completes before UAVs disconnect and are redeployed. The problem accounts for dynamic IoT devices and intermittent UAV connectivity and is NP-hard. To tackle it, we decompose it into three subproblems: \textit{(i)} optimizing the learning configuration and bandwidth allocation with an augmented Lagrangian method to reduce training cost; \textit{(ii)} assigning devices to UAVs adaptively with a TD3-based algorithm, guided by a device fitness score that combines data heterogeneity (measured by Kullback-Leibler divergence), device-to-UAV proximity, and computational resources; and \textit{(iii)} redeploying UAVs and selecting the global aggregator with a low-complexity two-stage greedy strategy, so that aggregation remains efficient despite UAV disconnections. Experiments on diverse real-world datasets validate the approach, demonstrating reduced training cost and robust performance under communication disruptions.
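To make the device fitness score in subproblem \textit{(ii)} more concrete, the following is a minimal Python sketch of how the three named ingredients (Kullback-Leibler divergence of a device's label distribution from a reference distribution, device-to-UAV distance, and computational capability) could be combined into one number. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function names `kl_divergence` and `fitness_score`, the `exp(-KL)` mapping, the normalization constants `max_distance_m` and `max_cpu_hz`, and the equal weights are all hypothetical. The TD3-based assignment itself is not shown.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length sequences.

    Terms with p_i == 0 contribute nothing; eps guards against division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def fitness_score(label_dist, reference_dist, distance_m, cpu_hz,
                  max_distance_m=1000.0, max_cpu_hz=2e9,
                  weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical fitness score in [0, 1]; higher means a better candidate.

    All normalization constants and weights are illustrative assumptions,
    not values taken from the paper.
    """
    # Data term: 1.0 when the device's labels match the reference distribution,
    # decaying toward 0 as the KL divergence grows.
    data_term = math.exp(-kl_divergence(label_dist, reference_dist))
    # Proximity term: 1.0 at zero distance, 0.0 at or beyond max_distance_m.
    proximity_term = max(0.0, 1.0 - distance_m / max_distance_m)
    # Compute term: CPU frequency as a fraction of an assumed maximum, capped at 1.0.
    compute_term = min(1.0, cpu_hz / max_cpu_hz)

    w_data, w_prox, w_comp = weights
    return w_data * data_term + w_prox * proximity_term + w_comp * compute_term


if __name__ == "__main__":
    reference = [0.5, 0.5]
    balanced = fitness_score([0.5, 0.5], reference, distance_m=200.0, cpu_hz=1.5e9)
    skewed = fitness_score([1.0, 0.0], reference, distance_m=200.0, cpu_hz=1.5e9)
    print(f"balanced device: {balanced:.3f}, skewed device: {skewed:.3f}")
```

In this sketch a device whose local labels are heavily skewed receives a lower score than an otherwise identical device with balanced labels. In the paper's setting such a score would be one input to the learned device-to-UAV assignment rather than a decision rule on its own.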