This work introduces a new method for selecting the number of components in finite mixture models (FMMs) using variational Bayes, motivated by the large-sample properties of the evidence lower bound (ELBO) under the mean-field (MF) variational approximation. Specifically, we establish matching upper and lower bounds for the ELBO without assuming conjugate priors, which suggests that model selection for FMMs based on maximizing the ELBO is consistent. As a by-product of our proof, we demonstrate that the MF approximation inherits the stable behavior of the posterior distribution, a consequence of model singularity: when the number of mixture components is over-specified, it tends to eliminate the extra components. This stable behavior also yields an $n^{-1/2}$ convergence rate for parameter estimation, up to a logarithmic factor, under such over-specification. Empirical experiments validate our theoretical findings and compare the method with other state-of-the-art approaches for selecting the number of components in FMMs.
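To make the selection rule concrete, it can be sketched in symbols. The notation below is an assumption introduced for illustration and is not taken from the paper: $X^n = (X_1, \dots, X_n)$ denotes the data, $Z^n$ the latent component labels, $\theta$ the mixture weights and component parameters of a $K$-component model with prior $\pi_K$, $\mathcal{Q}_K$ a mean-field family of factorized densities $q(\theta, Z^n) = q_\theta(\theta)\, q_Z(Z^n)$, and $K_{\max}$ a hypothetical user-chosen upper limit on the number of components.

$$
\mathrm{ELBO}_K \;=\; \sup_{q \in \mathcal{Q}_K} \mathbb{E}_{q}\!\left[ \log \frac{p(X^n, Z^n \mid \theta)\, \pi_K(\theta)}{q(\theta, Z^n)} \right] \;\le\; \log p_K(X^n),
\qquad
\widehat{K} \;=\; \operatorname*{arg\,max}_{1 \le K \le K_{\max}} \mathrm{ELBO}_K .
$$

The inequality is the standard fact that the ELBO lower-bounds the log marginal likelihood $\log p_K(X^n)$ of the $K$-component model. Under this reading, the matching bounds mentioned in the abstract are what would justify comparing $\mathrm{ELBO}_K$ across different values of $K$.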