Although Hierarchical Federated Learning (HFL) uses edge servers (ESs) to alleviate communication burdens, its model performance is degraded by non-IID data and limited communication resources. Existing works often assume that data is uniformly distributed, which contradicts the heterogeneity of IoT. Solutions that rely on additional model training to check the data distribution inevitably increase computational costs and the risk of privacy leakage. The challenges are how to reduce the impact of non-IID data without involving raw data, and how to allocate communication resources rationally to address the straggler problem. To tackle these challenges, we propose a novel optimization method based on coaLition formation gamE and grAdient Projection, called LEAP. Specifically, we combine the edge data distribution with a coalition formation game to dynamically adjust the associations between clients and ESs, which ensures optimal associations. We further capture client heterogeneity to allocate bandwidth rationally from the coalition perspective, and determine the optimal transmission power within specified delay constraints at the client level. Experimental results on four real datasets show that LEAP achieves a 20.62% improvement in model accuracy compared to state-of-the-art baselines. Moreover, LEAP reduces transmission energy consumption by a factor of at least about 2.24.
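The idea of adjusting client-to-ES associations through a coalition formation game can be illustrated with a minimal sketch. The abstract does not give LEAP's actual utility function or switch rule, so everything below is an assumption made for illustration: each client reports only a label histogram (not raw data), the cost of a coalition is the L1 distance between its aggregated label distribution and the global one, and a client switches ESs only when the switch strictly lowers the combined cost of the two coalitions involved. The names `coalition_cost` and `form_coalitions` are hypothetical.

```python
import numpy as np

def coalition_cost(hist_sum, global_dist):
    """L1 distance between a coalition's label distribution and the global one
    (an assumed cost; the abstract does not specify LEAP's utility)."""
    return float(np.abs(hist_sum / hist_sum.sum() - global_dist).sum())

def form_coalitions(client_hists, num_es, init_assign, max_rounds=100, tol=1e-12):
    """Greedy switch-based coalition formation.

    client_hists: per-client label counts, shape (num_clients, num_labels)
    init_assign:  initial ES index per client; every ES must start non-empty
    Returns the final assignment and the per-ES aggregated label counts.
    """
    hists = np.asarray(client_hists, dtype=float)
    assign = np.array(init_assign, dtype=int)
    global_dist = hists.sum(axis=0) / hists.sum()

    # Aggregate label counts and member counts per ES (coalition).
    sums = np.zeros((num_es, hists.shape[1]))
    sizes = np.zeros(num_es, dtype=int)
    for c, e in enumerate(assign):
        sums[e] += hists[c]
        sizes[e] += 1

    for _ in range(max_rounds):
        switched = False
        for c in range(len(hists)):
            src = assign[c]
            if sizes[src] == 1:
                continue  # assumption: a client never leaves its ES empty
            best_dst, best_gain = src, tol
            for dst in range(num_es):
                if dst == src:
                    continue
                before = (coalition_cost(sums[src], global_dist)
                          + coalition_cost(sums[dst], global_dist))
                after = (coalition_cost(sums[src] - hists[c], global_dist)
                         + coalition_cost(sums[dst] + hists[c], global_dist))
                if before - after > best_gain:
                    best_dst, best_gain = dst, before - after
            if best_dst != src:
                # Apply the switch that most reduces the combined cost.
                sums[src] -= hists[c]
                sums[best_dst] += hists[c]
                sizes[src] -= 1
                sizes[best_dst] += 1
                assign[c] = best_dst
                switched = True
        if not switched:
            break  # no client wants to switch: the partition is stable
    return assign, sums

# Toy usage: four clients, two labels, two ESs, label-skewed initial coalitions.
assign, sums = form_coalitions([[10, 0], [10, 0], [0, 10], [0, 10]], 2, [0, 0, 1, 1])
```

Because each accepted switch strictly lowers the total cost and the number of partitions is finite, the loop terminates under this rule. In the toy usage, the clients start in two single-label coalitions and end in two coalitions that each hold both labels in equal proportion.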