This study investigates clustered federated learning (FL), a formulation of FL for non-i.i.d. data in which devices are partitioned into clusters and each cluster optimally fits its data with a localized model. We propose a clustered FL framework that applies a nonconvex penalty to the pairwise differences of parameters. The framework estimates the cluster structure automatically, without a priori knowledge of the number of clusters or of which devices belong to each cluster. To implement it, we introduce a novel clustered FL method called Fusion Penalized Federated Clustering (FPFC). Building on the standard alternating direction method of multipliers (ADMM), FPFC performs partial updates at each communication round and allows parallel computation with variable workload. These strategies significantly reduce communication cost while preserving privacy, making the method practical for FL. We also propose a new warmup strategy for hyperparameter tuning in FL settings and explore an asynchronous variant of FPFC (asyncFPFC). Theoretical analysis provides convergence guarantees for FPFC with general losses and establishes the statistical convergence rate under a linear model with squared loss. Extensive experiments demonstrate that FPFC outperforms existing methods, including in robustness and generalization capability.
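To make the idea of a nonconvex penalty on pairwise parameter differences concrete, the following is a minimal sketch of such an objective. It is not the authors' implementation: the abstract does not name the penalty, so the minimax concave penalty (MCP) is assumed here as one common nonconvex choice, and the function names `mcp` and `fused_objective`, the parameters `lam` and `gamma`, and the per-device squared loss are all illustrative assumptions.

```python
import numpy as np

def mcp(t, lam, gamma):
    """Minimax concave penalty (MCP), an ASSUMED choice of nonconvex penalty.

    Grows like lam * |t| near zero and is constant (gamma * lam**2 / 2)
    once |t| exceeds gamma * lam, so large differences are not shrunk further.
    """
    t = np.abs(t)
    return np.where(t <= gamma * lam,
                    lam * t - t**2 / (2 * gamma),
                    gamma * lam**2 / 2)

def fused_objective(W, data, lam, gamma):
    """Per-device squared losses plus a penalty on all pairwise differences.

    W    : array of shape (m, d), one parameter vector per device
    data : list of m pairs (X_i, y_i) holding each device's local data
    """
    m = len(data)
    # Local fit: half the mean squared error of a linear model on each device.
    loss = sum(np.mean((X @ W[i] - y) ** 2) / 2
               for i, (X, y) in enumerate(data))
    # Fusion term: penalize the distance between every pair of device models.
    penalty = sum(float(mcp(np.linalg.norm(W[i] - W[j]), lam, gamma))
                  for i in range(m) for j in range(i + 1, m))
    return loss + penalty
```

Under this sketch, devices whose parameter vectors are driven to coincide can be read as belonging to the same cluster, while the flat tail of the penalty leaves well-separated devices largely unaffected, which is why neither the number of clusters nor the membership needs to be supplied in advance.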