While supervised federated learning approaches have enjoyed significant success, the domain of unsupervised federated learning remains relatively underexplored. Several federated EM algorithms have gained popularity in practice; however, their theoretical foundations are often lacking. In this paper, we first introduce a federated gradient EM algorithm (FedGrEM) designed for the unsupervised learning of mixture models, which supplements the existing federated EM algorithms by accounting for task heterogeneity and potential adversarial attacks. We then present a comprehensive finite-sample theory that holds for general mixture models, and apply this general theory to specific statistical models to characterize the explicit estimation error of model parameters and mixture proportions. Our theory elucidates when and how FedGrEM outperforms local single-task learning, with insights extending to existing federated EM algorithms, thereby bridging the gap between their practical success and theoretical understanding. Our simulation results validate our theory and demonstrate FedGrEM's superiority over existing unsupervised federated learning benchmarks.