In this work, we develop a new Bayesian method for variable selection in function-on-scalar regression (FOSR). The method combines a hierarchical Bayesian structure with latent variables to enable adaptive covariate selection in FOSR. Extensive simulation studies illustrate its main properties, namely accurate coefficient estimation and a high rate of correct variable selection. We also carried out an extensive comparison with the main competing methods: the Bayesian Group Lasso with Spike and Slab prior (BGLSS), the group LASSO (Least Absolute Shrinkage and Selection Operator), the group MCP (Minimax Concave Penalty), and the group SCAD (Smoothly Clipped Absolute Deviation). The results show that the proposed method selects the correct covariates more reliably than these competitors while maintaining a satisfactory goodness of fit, whereas the competing methods could not balance selection accuracy with goodness of fit. As an application, we analyzed a COVID-19 dataset together with socioeconomic data from Brazil and obtained satisfactory results. In short, the proposed Bayesian variable selection model is highly competitive in both predictive performance and selection quality.