Semi-functional linear regression models postulate a linear relationship between a scalar response and a functional covariate, and also include a nonparametric component involving a univariate explanatory variable. It is of practical importance to obtain estimators for these models that are robust against high-leverage outliers, which are generally difficult to identify and may seriously damage least squares and Huber-type $M$-estimators. For that reason, robust estimators for semi-functional linear regression models are constructed by combining $B$-splines, used to approximate both the functional regression parameter and the nonparametric component, with robust regression estimators based on a bounded loss function and a preliminary residual scale estimator. Consistency and rates of convergence for the proposed estimators are derived under mild regularity conditions. The reported numerical experiments show the advantage of the proposed methodology over the classical least squares and Huber-type $M$-estimators in finite samples. The analysis of real examples illustrates that the robust estimators provide better predictions for non-outlying points than the classical ones, and that when potential outliers are removed from the training and test sets both methods behave very similarly.
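
For orientation, the setup described above can be sketched in formulas. The notation here is an assumption introduced for illustration and is not taken from the abstract: $y_i$ is the scalar response, $X_i$ the functional covariate on an interval $\mathcal{I}$, $z_i$ the univariate explanatory variable, $\beta_0$ the functional regression parameter, $\eta_0$ the nonparametric component, and $\sigma_0$ the error scale. A model of this type may be written as

$$
y_i = \int_{\mathcal{I}} \beta_0(t)\, X_i(t)\, dt + \eta_0(z_i) + \sigma_0\, \varepsilon_i , \qquad i = 1, \dots, n .
$$

Assuming $B$-spline approximations $\beta_0 \approx \sum_{j=1}^{p} b_j B_j$ and $\eta_0 \approx \sum_{k=1}^{q} a_k \widetilde{B}_k$ (the basis sizes $p$ and $q$ are hypothetical tuning parameters), a robust estimator of the kind described would minimize a bounded loss $\rho$ of the residuals standardized by a preliminary scale estimate $s_n$:

$$
(\widehat{b}, \widehat{a}) = \arg\min_{b \in \mathbb{R}^{p},\, a \in \mathbb{R}^{q}} \; \sum_{i=1}^{n} \rho\!\left( \frac{y_i - \sum_{j=1}^{p} b_j\, x_{ij} - \sum_{k=1}^{q} a_k\, \widetilde{B}_k(z_i)}{s_n} \right), \qquad x_{ij} = \int_{\mathcal{I}} B_j(t)\, X_i(t)\, dt .
$$

Tukey's bisquare function is one common choice of bounded $\rho$; whether it is the loss used here is not stated in the abstract. Any identifiability constraint on $\eta_0$ is omitted from this sketch.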