In this paper, we present a novel data-driven method for estimating functions in a multivariate nonparametric regression model, based on adaptive knot selection for B-splines. The underlying idea is to select the knots with the generalized lasso, since the knots of a B-spline basis can be seen as the points where the derivatives of the function to be estimated change. The method is extended to functions of several variables by processing each dimension independently, which reduces the problem to a univariate setting. The regularization parameters are chosen using a criterion based on EBIC (the extended Bayesian information criterion). The nonparametric estimator is then obtained by multivariate B-spline regression using the selected knots. We validate the procedure through numerical experiments in which the number of observations and the noise level are varied to investigate its robustness. We also assess the influence of the observation sampling and apply the method to a chemical system commonly used in geoscience. In every setting considered in this paper, our approach performs better than state-of-the-art methods. Our completely data-driven method is implemented in the glober R package, which is available on the Comprehensive R Archive Network (CRAN).
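To make the knot-selection idea concrete, the following minimal Python sketch treats the simplest univariate case, a piecewise-linear signal, and penalizes its second-order differences with an l1 norm, so that a nonzero penalized difference marks a candidate knot. It is an illustration under stated assumptions, not the glober implementation: the function names `second_difference_matrix` and `generalized_lasso_admm` are hypothetical, the ADMM solver is a generic choice made here for self-containment, and the values of `lam`, `rho`, and `n_iter` are arbitrary.

```python
import numpy as np


def second_difference_matrix(n):
    """Return the (n - 2) x n second-order difference operator D."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D


def generalized_lasso_admm(y, D, lam, rho=1.0, n_iter=500):
    """Approximately minimize 0.5 * ||y - beta||^2 + lam * ||D beta||_1 by ADMM.

    Returns the fitted values beta and the auxiliary variable z, which
    approximates D @ beta and is sparse thanks to soft-thresholding.
    """
    n = y.size
    z = np.zeros(D.shape[0])
    u = np.zeros(D.shape[0])  # scaled dual variable
    A = np.eye(n) + rho * D.T @ D
    for _ in range(n_iter):
        # Quadratic subproblem in beta.
        beta = np.linalg.solve(A, y + rho * D.T @ (z - u))
        # Soft-thresholding step enforces sparsity of the differences.
        w = D @ beta + u
        z = np.sign(w) * np.maximum(np.abs(w) - lam / rho, 0.0)
        # Dual update: u <- u + D beta - z.
        u = w - z
    return beta, z


# Noise-free toy signal with a single slope change at x = 0.5 (assumed example).
x = np.linspace(0.0, 1.0, 101)
y = np.abs(x - 0.5)

D = second_difference_matrix(x.size)
beta, z = generalized_lasso_admm(y, D, lam=1e-3)

# Row j of D is centred on sample j + 1. After finitely many ADMM iterations
# tiny spurious entries may remain in z, so we report the largest change only.
knot_index = int(np.argmax(np.abs(z))) + 1
knot_x = x[knot_index]
```

On this toy signal, `knot_x` should fall at or very near 0.5, the location of the slope change. In a realistic setting the penalty level would not be fixed by hand but selected by a criterion such as the EBIC-based one mentioned in the abstract, and the retained locations would then serve as knots for a B-spline regression.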