In the framework of scalar-on-function regression models, in which several functional variables are employed to predict a scalar response, we propose a methodology for selecting relevant functional predictors while simultaneously providing accurate smooth (or, more generally, regular) estimates of the functional coefficients. We suppose that the functional predictors belong to a real separable Hilbert space, while the functional coefficients belong to a specific subspace of this Hilbert space. Such a subspace can be a Reproducing Kernel Hilbert Space (RKHS) to ensure the desired regularity characteristics, such as smoothness or periodicity, for the coefficient estimates. Our procedure, called SOFIA (Scalar-On-Function Integrated Adaptive Lasso), is based on an adaptive penalized least squares algorithm that leverages functional subgradients to efficiently solve the minimization problem. We demonstrate that the proposed method satisfies the functional oracle property, even when the number of predictors exceeds the sample size. SOFIA's effectiveness in variable selection and coefficient estimation is evaluated through extensive simulation studies and a real-data application to GDP growth prediction.