We consider a symmetric mixture of linear regressions with random samples from the pairwise comparison design, which can be seen as a noisy version of a type of Euclidean distance geometry problem. We analyze the expectation-maximization (EM) algorithm locally around the ground truth and establish that the sequence converges linearly, providing an $\ell_\infty$-norm guarantee on the estimation error of the iterates. Furthermore, we show that the limit of the EM sequence achieves the sharp rate of estimation in the $\ell_2$-norm, matching the information-theoretically optimal constant. We also argue through simulation that convergence from a random initialization is much more delicate in this setting, and does not appear to occur in general. Our results show that the EM algorithm can exhibit several unique behaviors when the covariate distribution is suitably structured.
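As a rough illustration of the local EM iteration described above, the following is a minimal sketch rather than the authors' implementation. It assumes the model $y = r\,(\theta^*_i - \theta^*_j) + \sigma\varepsilon$ with a hidden sign $r \in \{\pm 1\}$, Gaussian noise with known $\sigma$, and pairs $(i, j)$ drawn uniformly at random. It uses the standard EM update for a symmetric two-component mixture of linear regressions, in which the E-step weight is $\tanh(y\langle x, \theta\rangle/\sigma^2)$. The function name `em_pairwise_mlr`, the problem sizes, and the initialization radius are hypothetical choices made for this example. Because the covariates $x = e_i - e_j$ are orthogonal to the all-ones vector, the parameter is only identifiable up to a shift (and a global sign), so the sketch works with centered vectors.

```python
import numpy as np


def em_pairwise_mlr(pairs, y, theta0, sigma, n_iter=50):
    """EM for y = r * (theta[i] - theta[j]) + sigma * noise, with hidden r = +/-1.

    Hypothetical helper: assumes known sigma and a connected comparison graph.
    """
    d = theta0.shape[0]
    n = y.shape[0]
    # Pairwise comparison design: each row is e_i - e_j.
    X = np.zeros((n, d))
    X[np.arange(n), pairs[:, 0]] = 1.0
    X[np.arange(n), pairs[:, 1]] = -1.0
    # X^T X is a graph Laplacian, singular along the all-ones direction.
    # Adding ones/d makes it invertible without changing the solution for
    # right-hand sides orthogonal to the all-ones vector.
    gram = X.T @ X + np.ones((d, d)) / d
    theta = theta0 - theta0.mean()
    for _ in range(n_iter):
        # E-step: posterior mean of the hidden sign for each sample.
        w = np.tanh(y * (X @ theta) / sigma**2)
        # M-step: least squares against the sign-corrected responses.
        theta = np.linalg.solve(gram, X.T @ (w * y))
    return theta


# Synthetic check with a local initialization near the ground truth.
rng = np.random.default_rng(0)
d, n, sigma = 10, 5000, 0.5
theta_star = rng.normal(size=d)
theta_star -= theta_star.mean()
i = rng.integers(0, d, size=n)
j = (i + rng.integers(1, d, size=n)) % d  # guarantees j != i
pairs = np.stack([i, j], axis=1)
signs = rng.choice([-1.0, 1.0], size=n)
y = signs * (theta_star[i] - theta_star[j]) + sigma * rng.normal(size=n)

theta0 = theta_star + 0.1 * rng.normal(size=d)
theta_hat = em_pairwise_mlr(pairs, y, theta0, sigma)
# Error in the sup norm, up to the global sign ambiguity.
err = min(np.max(np.abs(theta_hat - theta_star)),
          np.max(np.abs(theta_hat + theta_star)))
print(f"l_inf error after EM: {err:.4f}")
```

Replacing `theta0` with a random vector unrelated to `theta_star` is one way to probe the abstract's remark that convergence from random initialization is more delicate; this sketch makes no claim about what that experiment shows.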