We consider Bayesian variable selection for binary outcomes under a probit link, with a spike-and-slab prior on the regression coefficients. Motivated by the computational challenges that Markov chain Monte Carlo (MCMC) samplers face in high-dimensional regimes, we develop a mean-field variational Bayes approximation in which every variational factor admits a closed-form update and the evidence lower bound is available in closed form. These properties yield an efficient coordinate ascent variational inference (CAVI) algorithm for optimizing the variational parameters. The approach produces posterior inclusion probabilities and parameter estimates, enabling interpretable selection and prediction within a single framework. In both simulated and real-data applications, the proposed method identifies the important variables and runs orders of magnitude faster than MCMC while maintaining comparable accuracy.
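To make the coordinate ascent idea concrete, the following is a minimal sketch of one possible scheme, not the algorithm of this paper. It assumes an Albert-Chib style latent Gaussian augmentation of the probit link, a point-mass spike with a Gaussian slab, fixed hyperparameters, and a factorization in which each coefficient has an inclusion probability `alpha[j]`, a slab mean `mu[j]`, and a slab variance. The names `cavi_probit_spike_slab`, `simulate_probit`, `prior_incl`, and `slab_var` are hypothetical, and the evidence lower bound is not computed here.

```python
import math
import random


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    # Crude floor to avoid division by zero; adequate for moderate |x| only.
    return max(0.5 * math.erfc(-x / math.sqrt(2.0)), 1e-300)


def cavi_probit_spike_slab(X, y, prior_incl=0.1, slab_var=1.0,
                           n_iter=200, tol=1e-6):
    """Hypothetical CAVI sketch. Returns (alpha, mu): alpha[j] approximates the
    posterior inclusion probability of predictor j, mu[j] its slab mean."""
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    xtx = [sum(v * v for v in col) for col in cols]
    alpha = [prior_incl] * p
    mu = [0.0] * p
    fitted = [0.0] * n  # X @ E[beta], where E[beta_j] = alpha[j] * mu[j]
    logit_prior = math.log(prior_incl / (1.0 - prior_incl))
    for _ in range(n_iter):
        # Latent factor: unit-variance normal centred at fitted[i], truncated
        # to the positive half-line if y[i] == 1 and to the negative otherwise.
        ez = []
        for i in range(n):
            s = 1.0 if y[i] == 1 else -1.0
            m = fitted[i]
            ez.append(m + s * norm_pdf(m) / norm_cdf(s * m))
        max_change = 0.0
        for j in range(p):
            old = alpha[j] * mu[j]
            s2 = 1.0 / (xtx[j] + 1.0 / slab_var)  # slab variance of q(beta_j)
            # x_j' (E[z] - contribution of all other predictors)
            r = sum(cols[j][i] * (ez[i] - fitted[i] + cols[j][i] * old)
                    for i in range(n))
            mu[j] = s2 * r
            log_odds = (logit_prior + 0.5 * math.log(s2 / slab_var)
                        + mu[j] ** 2 / (2.0 * s2))
            log_odds = max(min(log_odds, 700.0), -700.0)  # keep exp() finite
            alpha[j] = 1.0 / (1.0 + math.exp(-log_odds))
            new = alpha[j] * mu[j]
            for i in range(n):
                fitted[i] += cols[j][i] * (new - old)
            max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return alpha, mu


def simulate_probit(n, beta, seed=0):
    """Hypothetical helper: independent standard normal predictors, probit outcomes."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in beta] for _ in range(n)]
    y = [1 if sum(x * b for x, b in zip(row, beta)) + rng.gauss(0.0, 1.0) > 0
         else 0 for row in X]
    return X, y


# Usage: two active predictors out of six.
X_demo, y_demo = simulate_probit(400, [2.0, -2.0, 0.0, 0.0, 0.0, 0.0], seed=1)
alpha_demo, mu_demo = cavi_probit_spike_slab(X_demo, y_demo)
print([round(a, 3) for a in alpha_demo])
```

Each sweep first refreshes the expected latent responses and then updates one coefficient at a time in closed form, which is what makes this kind of scheme cheap relative to sampling. A fuller version would also track the evidence lower bound to monitor convergence and would treat the hyperparameters more carefully than the fixed values assumed here.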