Nonparametric regression models offer a way to understand and quantify relationships between variables without having to specify a parametric family of candidate regression functions in advance. Although many estimation methods for these models have been proposed in the literature, most of them can be highly sensitive to a small proportion of atypical observations in the training set. In this paper we review outlier-robust estimation methods for nonparametric regression models, paying particular attention to practical considerations. Since outliers can also harm the regression estimator indirectly, by distorting the selection of bandwidths or smoothing parameters, we also discuss available robust alternatives for this task. Finally, since many of the "classical" nonparametric regression estimators (and their robust counterparts) can be very challenging to use in settings with a moderate or large number of explanatory variables, we review recent robust nonparametric regression methods that scale well with a growing number of covariates.
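To make the sensitivity claim concrete, the following is a minimal sketch, not a method taken from the paper. It compares a classical Nadaraya-Watson kernel smoother with a local-constant Huber M-estimator on simulated data in which 10% of the responses are shifted upward. The function names, the Gaussian kernel, the fixed bandwidth `h = 0.05`, the tuning constant `c = 1.345`, and the MAD-based residual scale are all illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratios below."""
    return np.exp(-0.5 * u**2)


def nadaraya_watson(x0, x, y, h):
    """Classical local-constant (kernel-weighted mean) estimator at points x0."""
    w = gaussian_kernel((x[None, :] - x0[:, None]) / h)
    return (w @ y) / w.sum(axis=1)


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    return values[order][np.searchsorted(cum, 0.5 * cum[-1])]


def local_huber(x0, x, y, h, c=1.345, n_iter=50, tol=1e-8):
    """Local-constant Huber M-estimator computed by iteratively reweighted means.

    Assumptions of this sketch: a single global residual scale, estimated by
    the normalized MAD of residuals from a kernel-weighted median fit, and a
    kernel-weighted median as the starting value at each evaluation point.
    """
    k_eval = gaussian_kernel((x[None, :] - x0[:, None]) / h)
    k_data = gaussian_kernel((x[None, :] - x[:, None]) / h)

    # Robust residual scale from an initial fit at the observed design points.
    init_at_data = np.array([weighted_median(y, k) for k in k_data])
    resid = y - init_at_data
    scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))

    fit = np.array([weighted_median(y, k) for k in k_eval])
    for _ in range(n_iter):
        r = (y[None, :] - fit[:, None]) / scale
        # Huber weights: 1 for small residuals, c/|r| for large ones.
        huber_w = np.minimum(1.0, c / np.maximum(np.abs(r), 1e-12))
        w = k_eval * huber_w
        new_fit = (w @ y) / w.sum(axis=1)
        converged = np.max(np.abs(new_fit - fit)) < tol
        fit = new_fit
        if converged:
            break
    return fit


# Simulated example: smooth signal, Gaussian noise, 10% contaminated responses.
rng = np.random.default_rng(0)
n = 300
x = np.sort(rng.uniform(0.0, 1.0, n))
truth = lambda t: np.sin(2.0 * np.pi * t)
y = truth(x) + 0.1 * rng.normal(size=n)
outliers = rng.choice(n, size=n // 10, replace=False)
y[outliers] += 5.0

grid = np.linspace(0.1, 0.9, 81)  # interior points, to avoid boundary effects
h = 0.05  # fixed by hand here; not selected from the data
nw_mae = np.mean(np.abs(nadaraya_watson(grid, x, y, h) - truth(grid)))
robust_mae = np.mean(np.abs(local_huber(grid, x, y, h) - truth(grid)))
print(f"mean abs. error  Nadaraya-Watson: {nw_mae:.3f}  local Huber: {robust_mae:.3f}")
```

The bandwidth is held fixed in this sketch. As noted above, choosing it by ordinary least-squares cross-validation would itself be affected by the contaminated responses, which is why robust selection criteria are treated as a separate task.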