Federated Learning (FL) offers an innovative approach to privacy-preserving distributed machine learning and enables efficient crowd intelligence at large scale. However, a significant challenge arises when coordinating FL with crowd intelligence in which diverse client groups pursue disparate objectives because of data heterogeneity or distinct tasks. To address this challenge, we propose the Federated cINN Clustering Algorithm (FCCA), which robustly clusters clients into different groups, avoids mutual interference between clients with heterogeneous data, and thereby improves the performance of the global model. Specifically, FCCA uses a global encoder to transform each client's private data into multivariate Gaussian distributions. It then employs a generative model to learn the encoded latent features through maximum likelihood estimation, which eases optimization and avoids mode collapse. Finally, the central server collects the converged local models to approximate the similarities between clients and partitions the clients into distinct clusters accordingly. Extensive experiments on various models and datasets show that FCCA outperforms other state-of-the-art clustered federated learning algorithms. These results suggest that our approach has substantial potential to improve the efficiency and accuracy of real-world federated learning tasks.
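The final server-side step, grouping clients by the similarity of their converged local models, can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the partitioning rule, so everything here is an assumption made for illustration: local models are represented as flattened parameter vectors, similarity is taken to be cosine similarity, and two clients are linked into the same cluster when their similarity reaches a hypothetical `threshold`. The names `cosine_similarity`, `cluster_clients`, and `toy_params` are likewise hypothetical and are not part of FCCA.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two flattened parameter vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def cluster_clients(
    local_params: Dict[str, Sequence[float]], threshold: float = 0.9
) -> List[List[str]]:
    """Partition clients into clusters (assumed rule, not the paper's).

    Clients whose cosine similarity is >= threshold are linked, and the
    clusters are the connected components of the resulting graph.
    """
    clients = sorted(local_params)
    parent = {c: c for c in clients}  # union-find forest over client ids

    def find(c: str) -> str:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for i, a in enumerate(clients):
        for b in clients[i + 1:]:
            if cosine_similarity(local_params[a], local_params[b]) >= threshold:
                parent[find(a)] = find(b)  # merge the two components

    groups: Dict[str, List[str]] = {}
    for c in clients:
        groups.setdefault(find(c), []).append(c)
    return sorted(groups.values())


# Toy parameter vectors standing in for converged local models.
toy_params = {
    "client_a": [1.0, 0.1, 0.0],
    "client_b": [0.9, 0.2, 0.0],
    "client_c": [0.0, 0.1, 1.0],
    "client_d": [0.1, 0.0, 0.9],
}
clusters = cluster_clients(toy_params, threshold=0.9)
print(clusters)  # [['client_a', 'client_b'], ['client_c', 'client_d']]
```

Linking by a fixed threshold is only one possible design. A real system might instead apply hierarchical or spectral clustering to the full similarity matrix, and might compare model outputs or likelihoods rather than raw parameters.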