In this work, we propose a communication-efficient hierarchical federated learning algorithm for distributed setups that include core servers and multiple edge servers, each serving a cluster of devices. The clusters may have different learning tasks, and clusters with the same task collaborate. To implement the algorithm over wireless links, we propose a scalable clustered over-the-air aggregation scheme for the uplink, together with a bandwidth-limited broadcast scheme for the downlink. The combined scheme requires only a single resource block per algorithm iteration, independent of the number of edge servers and devices. This setup faces interference among devices in the uplink and among edge servers in the downlink, both of which must be modeled rigorously. We first develop a spatial model for the setup by modeling devices as a Poisson cluster process over the edge servers, and we quantify the uplink and downlink error terms caused by the interference. Building on this model, we present a comprehensive mathematical approach to derive a convergence bound for the proposed algorithm that holds for any number of collaborating clusters, and we provide special cases and design remarks. Finally, we show that despite the interference and data heterogeneity, the proposed algorithm not only achieves high learning accuracy across a variety of parameter settings but also significantly outperforms the conventional hierarchical learning algorithm.
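The uplink idea in the abstract, in which all devices transmit in the same resource block and each edge server receives a superposition that includes interference from other clusters, can be illustrated with a toy sketch. This is not the paper's scheme: the constant `cross_gain`, the Gaussian noise level, and the function names `ota_cluster_aggregate` and `task_average` are all hypothetical, and the sketch ignores fading, power control, and the Poisson cluster geometry used in the actual analysis.

```python
import random


def ota_cluster_aggregate(updates_by_cluster, cross_gain=0.05, noise_std=0.01, rng=None):
    """Toy over-the-air aggregation (illustrative assumption, not the paper's model).

    Every device transmits at once. Edge server c receives its own cluster's
    updates with unit gain, other clusters' updates scaled by `cross_gain`
    (interference), plus Gaussian noise. It then divides by its own cluster size.
    """
    rng = rng or random.Random(0)
    dim = len(updates_by_cluster[0][0])
    estimates = []
    for c, own_cluster in enumerate(updates_by_cluster):
        received = [0.0] * dim
        for k, cluster in enumerate(updates_by_cluster):
            gain = 1.0 if k == c else cross_gain
            for update in cluster:
                for i in range(dim):
                    received[i] += gain * update[i]
        for i in range(dim):
            received[i] += rng.gauss(0.0, noise_std)
        estimates.append([r / len(own_cluster) for r in received])
    return estimates


def task_average(estimates, task_of_cluster):
    """Average edge-server estimates over clusters that share the same task."""
    groups = {}
    for estimate, task in zip(estimates, task_of_cluster):
        groups.setdefault(task, []).append(estimate)
    return {
        task: [sum(column) / len(group) for column in zip(*group)]
        for task, group in groups.items()
    }


if __name__ == "__main__":
    # Two clusters working on the same (hypothetical) task "a".
    updates = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
    noisy = ota_cluster_aggregate(updates)
    print(task_average(noisy, ["a", "a"]))
```

With `cross_gain=0` and `noise_std=0` each edge server recovers the exact mean of its own cluster, so the gap between that ideal value and the noisy estimate plays the role of the uplink error term in this simplified setting.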