We consider a new method for estimating the parameters of univariate Gaussian mixture models. The method relies on a nonparametric density estimator $\hat{f}_n$ (typically a kernel estimator). For every set of Gaussian mixture components, $\hat{f}_n$ is used to find the best set of mixture weights: those minimizing the $L_2$ distance between $\hat{f}_n$ and the Gaussian mixture density with the given component parameters. The component densities together with the obtained weights are then plugged into the likelihood function, resulting in the so-called pseudo-likelihood function. The final parameter estimators are the parameter values that maximize the pseudo-likelihood function, together with the corresponding weights. The advantages of the pseudo-likelihood over the full likelihood are: 1) its arguments are the means and variances only, since the mixture weights are themselves functions of the means and variances; 2) unlike the likelihood function, it is always bounded above. Thus, the maximizer of the pseudo-likelihood function, referred to as the pseudo-likelihood estimator, always exists. In this article, we prove that the pseudo-likelihood estimator is strongly consistent.
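The construction above can be sketched numerically. The following is a minimal illustrative sketch, not the paper's exact procedure: the kernel bandwidth, the evaluation grid, and the discretized least-squares step for the $L_2$-optimal weights (with clipping and renormalization onto the simplex in place of a constrained quadratic program) are all assumptions made for the example.

```python
# Illustrative sketch of the pseudo-likelihood construction (assumed details:
# bandwidth, grid, and the simplified least-squares weight step).
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def kernel_density(data, grid, bandwidth):
    # Gaussian kernel estimator \hat{f}_n evaluated on a grid.
    return gaussian_pdf(grid[:, None], data[None, :], bandwidth).mean(axis=1)

def l2_weights(f_hat, grid, means, sds):
    # For fixed component parameters, pick weights minimizing the (discretized)
    # L_2 distance between \hat{f}_n and the mixture density; clipping and
    # renormalizing is a simplification of the simplex-constrained problem.
    Phi = gaussian_pdf(grid[:, None],
                       np.asarray(means)[None, :], np.asarray(sds)[None, :])
    w, *_ = np.linalg.lstsq(Phi, f_hat, rcond=None)
    w = np.clip(w, 0.0, None)
    return w / w.sum()

def pseudo_log_likelihood(data, means, sds, f_hat, grid):
    # Plug the component densities and L_2-optimal weights into the
    # log-likelihood, yielding the pseudo-log-likelihood at (means, sds).
    w = l2_weights(f_hat, grid, means, sds)
    mix = gaussian_pdf(data[:, None],
                       np.asarray(means)[None, :], np.asarray(sds)[None, :]) @ w
    return np.sum(np.log(mix + 1e-300)), w

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(2, 1, 700)])
grid = np.linspace(-6, 6, 400)
f_hat = kernel_density(data, grid, bandwidth=0.4)
pll, w = pseudo_log_likelihood(data, [-2.0, 2.0], [1.0, 1.0], f_hat, grid)
```

Maximizing `pseudo_log_likelihood` over the means and variances alone would then give the pseudo-likelihood estimator, with the weights recovered as a by-product.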