In practical regression applications, multiple covariates are often measured, but not all may be associated with the response variable. Identifying and including only the relevant covariates in the model is crucial for improving prediction accuracy. In this work, we develop a variational inference approach for estimation and variable selection in scalar-on-function regression, involving only functional covariates, and in partially functional regression models that also include scalar covariates. Specifically, we develop a variational expectation-maximization (VEM) algorithm, with a variational Bayes procedure implemented in the E-step to obtain approximate marginal posterior distributions for most model parameters, except for the regularization parameters, which are updated in the M-step. Our method accurately identifies relevant covariates while maintaining strong predictive performance, as demonstrated through extensive simulation studies across diverse scenarios. Compared with alternative approaches, including BGLSS (Bayesian Group Lasso with Spike-and-Slab priors), grLASSO (group Least Absolute Shrinkage and Selection Operator), grMCP (group Minimax Concave Penalty), and grSCAD (group Smoothly Clipped Absolute Deviation), our approach achieves a superior balance between goodness-of-fit and sparsity in most scenarios. We further illustrate its practical utility through real-data applications involving spectral analysis of sugar samples and weather measurements from Japan.
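To make the group structure behind the variable-selection problem concrete, the sketch below shows one common way to reduce scalar-on-function regression to a linear model with one block of coefficients per functional covariate. It is an illustration under stated assumptions, not the VEM algorithm described above: the cosine basis, the simulated curves, the helper name `basis_design`, and the ordinary least-squares fit are all hypothetical choices made for this example. The point is only that selecting a functional covariate amounts to keeping or dropping its whole coefficient block.

```python
import numpy as np

def basis_design(X, t, n_basis):
    """Approximate the integrals of X_i(t) * phi_k(t) over the grid t.

    X is (n_curves, n_grid); phi_k(t) = cos(k * pi * t) is an assumed basis.
    Returns an (n_curves, n_basis) block of the design matrix.
    """
    k = np.arange(n_basis)
    B = np.cos(np.pi * np.outer(t, k))           # basis evaluated on the grid
    w = np.empty_like(t)                         # trapezoidal quadrature weights
    w[1:-1] = (t[2:] - t[:-2]) / 2
    w[0] = (t[1] - t[0]) / 2
    w[-1] = (t[-1] - t[-2]) / 2
    return (X * w) @ B

rng = np.random.default_rng(0)
n, n_grid, n_cov, n_basis = 200, 101, 3, 4
t = np.linspace(0.0, 1.0, n_grid)

# Simulated smooth curves: random combinations of six cosine modes (assumption).
modes = np.cos(np.pi * np.outer(np.arange(6), t))            # (6, n_grid)
curves = [rng.normal(size=(n, 6)) @ modes for _ in range(n_cov)]

# One design block per functional covariate, plus an intercept column.
blocks = [basis_design(X, t, n_basis) for X in curves]
Z = np.hstack([np.ones((n, 1))] + blocks)

# Only the first functional covariate is related to the response.
b_true = np.array([1.0, -2.0, 0.5, 0.0])
y = blocks[0] @ b_true + 0.1 * rng.normal(size=n)

# Plain least squares as a stand-in for a group-sparse or Bayesian fit.
coef = np.linalg.lstsq(Z, y, rcond=None)[0]
groups = coef[1:].reshape(n_cov, n_basis)        # one row per covariate
group_norms = np.linalg.norm(groups, axis=1)
print(group_norms)
```

In this simulated setting the first block has a much larger coefficient norm than the other two. Group-selection methods such as those compared in the abstract operate on blocks of this kind, shrinking or zeroing each block as a unit rather than coefficient by coefficient.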