Expectation Maximization (EM) Converges for General Agnostic Mixtures

Mixtures of linear regressions are well studied in statistics and machine learning, where the data points are generated probabilistically using $k$ linear models. Algorithms like Expectation Maximization (EM) may be used to recover the ground-truth regressors for this problem. Recently, \cite{pal2022learning,ghosh_agnostic} studied the mixed linear regression problem in the agnostic setting, where no generative model on the data is assumed. Rather, given a set of data points, the objective is to \emph{fit} $k$ lines by minimizing a suitable loss function. It is shown there that a modification of EM, namely gradient EM, converges exponentially fast to an appropriately defined loss minimizer even in the agnostic setting. In this paper, we study the problem of \emph{fitting} $k$ parametric functions to a given set of data points. We adhere to the agnostic setup. However, instead of fitting lines equipped with the quadratic loss, we consider fitting arbitrary parametric functions equipped with a strongly convex and smooth loss. This framework encompasses a large class of problems, including (regularized) mixed linear regression, mixed linear classifiers (mixed logistic regression, mixed Support Vector Machines), and mixed generalized linear regression. We propose and analyze gradient EM for this problem and show that, under proper initialization and a separation condition, the iterates of gradient EM converge exponentially fast to appropriately defined population loss minimizers with high probability. This shows the effectiveness of EM-type algorithms, which converge to an \emph{optimal} solution in the non-generative setup beyond mixtures of linear regressions.
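To make the idea of gradient EM concrete, the following is a minimal sketch for one instance of the framework, regularized mixed linear regression. It is not the paper's exact algorithm: the function names, the soft-min weighting with inverse temperature `beta`, the regularization weight `lam`, and the step size are all assumptions chosen for illustration. Each iteration computes soft assignments of points to components from the current per-component losses, then takes one gradient step on the weighted, ridge-regularized squared loss (which is strongly convex and smooth in each parameter vector).

```python
import numpy as np

def soft_assignments(X, y, thetas, beta=1.0):
    """Soft-min weights over components, shape (n, k).

    Assumed form: w_ij proportional to exp(-beta * loss_j(x_i, y_i)).
    """
    losses = (X @ thetas.T - y[:, None]) ** 2       # per-point, per-component loss
    logits = -beta * losses
    logits -= logits.max(axis=1, keepdims=True)     # stabilize the exponentials
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def gradient_em(X, y, thetas0, step=0.1, lam=1e-3, beta=1.0, iters=200):
    """Illustrative gradient EM: one gradient step per round instead of a full M-step."""
    thetas = thetas0.copy()
    n = X.shape[0]
    for _ in range(iters):
        w = soft_assignments(X, y, thetas, beta)    # E-like step, weights held fixed
        resid = X @ thetas.T - y[:, None]           # residuals, shape (n, k)
        # Gradient of (1/n) * sum_i w_ij * resid_ij^2 + lam * ||theta_j||^2 in theta_j
        grads = 2.0 * (w * resid).T @ X / n + 2.0 * lam * thetas
        thetas = thetas - step * grads              # gradient (partial M) step
    return thetas
```

In this sketch the initial parameters `thetas0` would need to start reasonably close to the target minimizers, and the components would need to be well separated, mirroring the initialization and separation conditions mentioned above.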
