We systematically study several network-based Expectation-Maximization (EM) algorithms for the Gaussian mixture model within decentralized federated learning (DFL). Our theoretical investigation shows that directly extending the classic EM algorithm to DFL leads to a biased estimator when data are heterogeneously distributed across sites. To address this, we introduce a momentum network EM (MNEM) algorithm, which integrates information from both current and historical estimators from previous DFL iterations. We further develop a semi-supervised MNEM (semi-MNEM) algorithm, which utilizes information provided by partially labeled data. Rigorous theoretical analysis demonstrates that the MNEM estimator can achieve the same asymptotic efficiency as the whole-sample estimator under appropriate regularity conditions, even with heterogeneous data. Moreover, the semi-MNEM estimator significantly improves the convergence speed of the MNEM algorithm, even if different mixture components are poorly separated. Extensive simulations are conducted, and a widely used chest X-ray dataset is analyzed to demonstrate the finite-sample performance of the proposed methods.
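As a rough illustration of the idea of combining current and historical estimators, the following is a minimal sketch of a momentum-style decentralized EM loop for a one-dimensional Gaussian mixture. It is not the MNEM update rule from this work, which the abstract does not specify. The function names `local_em_step` and `momentum_network_em`, the row-stochastic mixing matrix `W`, the fixed momentum weight `alpha`, and the toy data are all assumptions made for illustration. With a constant `alpha` this sketch makes no claim of removing the heterogeneity bias discussed above.

```python
import numpy as np

def local_em_step(x, weights, means, variances):
    """One EM step for a 1-D Gaussian mixture using a single node's data."""
    # E-step: posterior responsibility of each component for each point.
    diff = x[:, None] - means[None, :]
    log_p = (np.log(weights) - 0.5 * np.log(2 * np.pi * variances)
             - 0.5 * diff ** 2 / variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_p)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted moments.
    nk = resp.sum(axis=0) + 1e-12
    new_w = nk / nk.sum()
    new_m = (resp * x[:, None]).sum(axis=0) / nk
    new_v = (resp * (x[:, None] - new_m) ** 2).sum(axis=0) / nk + 1e-6
    return new_w, new_m, new_v

def momentum_network_em(data, W, init, alpha=0.5, n_iter=100):
    """Hypothetical momentum-style network EM (illustrative assumption).

    data : list of 1-D arrays, one per node
    W    : row-stochastic mixing matrix encoding the network
    init : tuple (weights, means, variances) shared by all nodes
    """
    m, K = len(data), len(init[0])
    theta = np.tile(np.concatenate(init), (m, 1))  # shape (m, 3K)
    for _ in range(n_iter):
        mixed = W @ theta  # average the previous estimates of neighbors
        new_theta = np.empty_like(theta)
        for i in range(m):
            w, mu, var = mixed[i, :K], mixed[i, K:2 * K], mixed[i, 2 * K:]
            em = np.concatenate(local_em_step(data[i], w, mu, var))
            # Blend the fresh local EM update with the historical estimate.
            new_theta[i] = alpha * em + (1 - alpha) * mixed[i]
        theta = new_theta
    return theta

# Toy heterogeneous data: each node sees a different share of component 0.
rng = np.random.default_rng(0)
data = []
for p in [0.2, 0.4, 0.6, 0.8]:
    n0 = int(500 * p)
    data.append(np.concatenate([rng.normal(-3.0, 1.0, n0),
                                rng.normal(3.0, 1.0, 500 - n0)]))

# Ring network over four nodes (each row sums to one).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

init = (np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0]))
theta = momentum_network_em(data, W, init, alpha=0.5, n_iter=100)
```

Each row of `theta` holds one node's estimate, ordered as mixing weights, means, then variances. Because every update is a convex combination of valid parameter vectors, the weights stay on the simplex and the variances stay positive.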