This work introduces a new method for selecting the number of components in finite mixture models (FMMs) using variational Bayes, motivated by the large-sample properties of the evidence lower bound (ELBO) under the mean-field (MF) variational approximation. Specifically, we establish matching upper and lower bounds for the ELBO without assuming conjugate priors, which suggests that selecting the number of components by maximizing the ELBO is consistent. As a by-product of our proof, we show that the MF approximation inherits the stable behavior of the posterior distribution, a consequence of model singularity: when the number of mixture components is over-specified, it tends to eliminate the extra components. This stable behavior also yields the $n^{-1/2}$ convergence rate for parameter estimation, up to a logarithmic factor, under such over-specification. Empirical experiments validate our theoretical findings and compare the proposed method with other state-of-the-art methods for selecting the number of components in FMMs.
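For orientation, the selection rule described above can be written out in a minimal form. The notation here is an assumption and is not taken from the paper: $X$ denotes the $n$ observations, $Z$ the latent component labels, $\theta_K$ the parameters of a $K$-component mixture, and $\mathcal{Q}_{\mathrm{MF}}$ the mean-field family in which $q$ factorizes over $Z$ and $\theta_K$. The inequality is the standard property that the ELBO lower-bounds the log evidence.

$$
\mathrm{ELBO}_K(q) = \mathbb{E}_{q}\!\left[\log p(X, Z, \theta_K \mid K)\right] - \mathbb{E}_{q}\!\left[\log q(Z, \theta_K)\right] \le \log p(X \mid K),
\qquad
\widehat{K} = \arg\max_{K} \; \max_{q \in \mathcal{Q}_{\mathrm{MF}}} \mathrm{ELBO}_K(q).
$$

Under this reading, each candidate $K$ is scored by its maximized ELBO, and the candidate with the largest score is selected.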