This paper addresses the statistical estimation of Gaussian Mixture Models (GMMs) with unknown diagonal covariances from independent and identically distributed samples. We employ the Beurling-LASSO (BLASSO), a convex optimization framework that promotes sparsity in the space of measures, to simultaneously estimate the number of components and their parameters. Our main contribution extends the BLASSO methodology to multivariate GMMs with component-specific unknown diagonal covariance matrices. This setting is significantly more flexible than previous approaches, which required known and identical covariances. We establish non-asymptotic recovery guarantees with nearly parametric convergence rates for component means, diagonal covariances, and weights, as well as for density prediction. A key theoretical contribution is the identification of an explicit separation condition on mixture components that enables the construction of non-degenerate dual certificates, which are essential tools for establishing statistical guarantees for the BLASSO. Our analysis leverages the Fisher-Rao geometry of the statistical model and introduces a novel semi-distance adapted to our framework, providing new insights into the interplay between component separation, parameter space geometry, and achievable statistical recovery.
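To make the statistical setting concrete, the following is a minimal sketch of the data model only: a mixture of Gaussians in which each component carries its own diagonal covariance, together with i.i.d. sampling from it. It does not implement the BLASSO estimator or any part of the analysis. The function names `gmm_density` and `sample_gmm` and all numerical values are illustrative assumptions, not taken from the paper.

```python
import math
import random


def gmm_density(x, weights, means, variances):
    """Density at point x of a Gaussian mixture with diagonal covariances.

    variances[k][j] is the j-th diagonal entry of component k's covariance,
    so each component may have a different covariance matrix.
    """
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        # A diagonal covariance makes the log-density a sum over coordinates.
        log_p = 0.0
        for xj, mj, vj in zip(x, mu, var):
            log_p += -0.5 * math.log(2.0 * math.pi * vj) - 0.5 * (xj - mj) ** 2 / vj
        total += w * math.exp(log_p)
    return total


def sample_gmm(n, weights, means, variances, seed=0):
    """Draw n i.i.d. samples: pick a component, then sample each coordinate."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        samples.append(
            [rng.gauss(m, math.sqrt(v)) for m, v in zip(means[k], variances[k])]
        )
    return samples


# Illustrative (assumed) two-component mixture in dimension 2.
weights = [0.3, 0.7]
means = [[0.0, 0.0], [4.0, -2.0]]
variances = [[1.0, 0.25], [0.5, 2.0]]  # component-specific diagonal entries

samples = sample_gmm(1000, weights, means, variances)
print(len(samples), gmm_density([0.0, 0.0], weights, means, variances))
```

In the problem described above, the number of components, the weights, the means, and the diagonal variances would all be unknown and estimated from `samples` alone.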