Mixture models are a fundamental tool with versatile applications. However, their training techniques, such as the popular Expectation-Maximization (EM) algorithm, are notoriously sensitive to parameter initialization and often get trapped in bad local optima that can be arbitrarily worse than the global optimum. To address this long-standing bad-local-optima challenge, we draw inspiration from recent ground-breaking foundation models and propose to leverage their underlying big learning principle to upgrade EM. Specifically, we present Big Learning EM (BigLearn-EM), an EM upgrade that simultaneously performs joint, marginal, and orthogonally transformed marginal matching between the data and model distributions. In simulated experiments, we show empirically that BigLearn-EM reaches the global optimum with high probability; comparisons on benchmark clustering datasets further demonstrate its effectiveness and advantages over existing techniques. The code is available at https://github.com/YulaiCong/Big-Learning-Expectation-Maximization.
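To make the three matching targets concrete, here is a minimal sketch that assumes the mixture is Gaussian; the abstract does not name the component family, so this is an assumption. For a Gaussian mixture, both a coordinate marginal and an orthogonally transformed version are again Gaussian mixtures with parameters available in closed form, so each can be scored against data with an ordinary log-likelihood. The function names (`gmm_log_density`, `marginal_gmm`, `transform_gmm`) are hypothetical and are not taken from the authors' repository. The sketch only evaluates the three objectives and does not implement the BigLearn-EM updates.

```python
import numpy as np

def gmm_log_density(x, weights, means, covs):
    """Log density of each row of x under a Gaussian mixture (log-sum-exp)."""
    n, d = x.shape
    terms = []
    for w, m, c in zip(weights, means, covs):
        diff = x - m
        maha = np.sum(diff * np.linalg.solve(c, diff.T).T, axis=1)
        _, logdet = np.linalg.slogdet(c)
        terms.append(np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha))
    terms = np.stack(terms, axis=1)                    # shape (n, K)
    mx = terms.max(axis=1, keepdims=True)
    return (mx + np.log(np.exp(terms - mx).sum(axis=1, keepdims=True)))[:, 0]

def marginal_gmm(weights, means, covs, idx):
    """Parameters of the marginal mixture over the coordinates in idx."""
    idx = np.asarray(idx)
    return weights, means[:, idx], covs[:, idx[:, None], idx[None, :]]

def transform_gmm(weights, means, covs, A):
    """Parameters of the mixture followed by y = A x (A assumed invertible)."""
    return weights, means @ A.T, A @ covs @ A.T

# Toy data and an arbitrary 2-component model in 3 dimensions (illustrative only).
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 3))
weights = np.array([0.4, 0.6])
means = rng.standard_normal((2, 3))
covs = np.stack([np.eye(3), 2.0 * np.eye(3)])

# Random orthogonal matrix from a QR decomposition.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

# Joint matching: average log-likelihood of the full data.
joint_ll = gmm_log_density(x, weights, means, covs).mean()

# Marginal matching: score only coordinates 0 and 2.
idx = [0, 2]
marg_ll = gmm_log_density(x[:, idx], *marginal_gmm(weights, means, covs, idx)).mean()

# Transformed marginal matching: rotate data and model, then take a marginal.
tw, tm, tc = transform_gmm(weights, means, covs, Q)
rot_marg_ll = gmm_log_density((x @ Q.T)[:, idx], *marginal_gmm(tw, tm, tc, idx)).mean()
```

Because an orthogonal transform has unit absolute determinant, scoring the fully transformed data under the transformed model gives the same value as `joint_ll`. Only the marginals of the rotated model provide objectives that differ from the joint one.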