We consider Bayesian variable selection for binary outcomes under a probit link with a spike-and-slab prior on the regression coefficients. Motivated by the computational challenges that Markov chain Monte Carlo (MCMC) samplers face in high-dimensional regimes, we develop a mean-field variational Bayes approximation in which every variational factor admits a closed-form update and the evidence lower bound is also available in closed form. These closed forms yield an efficient coordinate ascent variational inference (CAVI) algorithm for optimizing the variational parameters. The approach produces posterior inclusion probabilities and parameter estimates, enabling interpretable selection and prediction within a single framework. In both simulated and real-data applications, the proposed method identifies the important variables and runs orders of magnitude faster than MCMC while maintaining comparable accuracy.
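To make the coordinate ascent scheme concrete, the following is a minimal Python sketch of one way such an algorithm could look. It is not taken from the paper and may differ from the authors' updates. It assumes an Albert-Chib latent Gaussian augmentation of the probit likelihood, a Gaussian slab with fixed variance `slab_var`, a fixed prior inclusion probability `prior_incl`, no intercept, and a factorization of the form q(z) ∏_j q(β_j, γ_j). The function name `cavi_probit_spike_slab` and all parameter names are hypothetical, and the evidence lower bound is not computed here; convergence is monitored through the change in the inclusion probabilities instead.

```python
import math
import numpy as np

# Elementwise complementary error function built from the standard library.
_erfc = np.vectorize(math.erfc)


def norm_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * _erfc(-x / math.sqrt(2.0))


def norm_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)


def cavi_probit_spike_slab(X, y, prior_incl=0.1, slab_var=1.0,
                           n_iter=200, tol=1e-6):
    """Mean-field CAVI sketch for probit regression with a spike-and-slab prior.

    Assumed model (illustrative, not the paper's exact specification):
        z_i ~ N(x_i' (gamma * beta), 1),  y_i = 1{z_i > 0},
        beta_j ~ N(0, slab_var),  gamma_j ~ Bernoulli(prior_incl).
    Returns inclusion probabilities alpha, slab means mu, slab variances s2.
    """
    n, p = X.shape
    sign = 2.0 * y - 1.0                      # +1 for y = 1, -1 for y = 0
    s2 = 1.0 / (np.sum(X ** 2, axis=0) + 1.0 / slab_var)
    alpha = np.full(p, prior_incl)
    mu = np.zeros(p)
    fit = X @ (alpha * mu)                    # E_q[X (gamma * beta)]
    logit_prior = math.log(prior_incl / (1.0 - prior_incl))

    for _ in range(n_iter):
        alpha_old = alpha.copy()

        # q(z_i) is a unit-variance normal truncated by y_i; this is its mean.
        cdf = np.maximum(norm_cdf(sign * fit), 1e-300)
        ez = fit + sign * norm_pdf(fit) / cdf

        # Update each q(beta_j, gamma_j) in turn, holding E[z] fixed.
        for j in range(p):
            fit -= X[:, j] * (alpha[j] * mu[j])        # drop column j
            mu[j] = s2[j] * (X[:, j] @ (ez - fit))
            logit = (logit_prior
                     + 0.5 * math.log(s2[j] / slab_var)
                     + mu[j] ** 2 / (2.0 * s2[j]))
            logit = min(max(logit, -35.0), 35.0)       # guard exp overflow
            alpha[j] = 1.0 / (1.0 + math.exp(-logit))
            fit += X[:, j] * (alpha[j] * mu[j])        # restore column j

        if np.max(np.abs(alpha - alpha_old)) < tol:
            break

    return alpha, mu, s2


if __name__ == "__main__":
    # Small simulated example: 3 active predictors out of 20.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 20))
    beta_true = np.zeros(20)
    beta_true[:3] = [2.0, -2.0, 1.5]
    y = (X @ beta_true + rng.standard_normal(500) > 0).astype(float)
    alpha, mu, _ = cavi_probit_spike_slab(X, y)
    print(np.round(alpha, 3))
```

In this sketch the posterior inclusion probability of predictor j is read off as `alpha[j]`, and `alpha * mu` serves as a point estimate of the coefficients for prediction. Each sweep costs O(np) because the fitted values are updated incrementally rather than recomputed, which is the kind of saving that makes coordinate ascent attractive relative to MCMC in high dimensions.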