Non-independent and identically distributed (non-IID) data adversely affects federated learning (FL), while heterogeneity in communication quality can undermine the reliability of model parameter transmission, potentially degrading the convergence of wireless FL. This paper proposes a novel dual-segment clustering (DSC) strategy that jointly addresses communication and data heterogeneity in FL. It does so by defining a new signal-to-noise ratio (SNR) matrix and a new information quantity matrix to capture communication and data heterogeneity, respectively. The affinity propagation algorithm is then used to iteratively refine the clustering of clients based on these matrices, which improves model aggregation in heterogeneous environments. Convergence analysis and experimental results show that the DSC strategy improves the convergence rate of wireless FL and achieves higher accuracy in heterogeneous environments than classical clustering methods.
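As a rough illustration of the clustering step described in the abstract, the Python sketch below runs a plain affinity propagation over a client-to-client similarity matrix. It is not the paper's DSC procedure: the abstract does not give the definitions of the SNR matrix or the information quantity matrix, so the similarity used here (negative squared distance over a scaled per-client SNR and a label histogram) is an assumption, as are the names `build_similarity`, `cluster_clients`, and `snr_scale` and the example client values.

```python
from statistics import median


def build_similarity(snr_db, label_hist, snr_scale=10.0):
    """Hypothetical similarity (NOT the paper's matrices): negative squared
    distance between clients in scaled SNR plus label-histogram space."""
    n = len(snr_db)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d_snr = ((snr_db[i] - snr_db[j]) / snr_scale) ** 2
            d_data = sum((p - q) ** 2 for p, q in zip(label_hist[i], label_hist[j]))
            S[i][j] = -(d_snr + d_data)
    # Preference on the diagonal: median pairwise similarity, a common default.
    pref = median(S[i][j] for i in range(n) for j in range(i + 1, n))
    for k in range(n):
        S[k][k] = pref
    return S


def affinity_propagation(S, damping=0.7, max_iter=300):
    """Standard affinity propagation with damped message updates.
    Needs at least two points. Returns one exemplar index per point."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) for i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) if i != k else 0.0 for i in range(n)]
            total = sum(pos)
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if A[k][k] + R[k][k] > 0]
    if not exemplars:
        return list(range(n))  # fallback: every client is its own cluster
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


def cluster_clients(snr_db, label_hist):
    return affinity_propagation(build_similarity(snr_db, label_hist))


# Made-up example: three high-SNR clients holding mostly class 0,
# and three low-SNR clients holding mostly class 1.
snr_db = [20.0, 22.0, 18.5, 5.0, 7.0, 4.0]
label_hist = [[0.85, 0.15], [0.90, 0.10], [0.80, 0.20],
              [0.15, 0.85], [0.10, 0.90], [0.25, 0.75]]
labels = cluster_clients(snr_db, label_hist)
print(labels)  # exemplar index assigned to each client
```

In this sketch the two kinds of heterogeneity are folded into a single similarity score for simplicity; how the paper actually combines or sequences its two matrices is not stated in the abstract. Clients that share an exemplar would be aggregated together, and the damping factor and preference value are tuning choices that affect how many clusters emerge.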