Multivariate adaptive regression splines (MARS) is a popular method for nonparametric regression introduced by Friedman in 1991. MARS fits simple nonlinear and non-additive functions to regression data. We propose and study a natural lasso variant of the MARS method. Our method is based on least squares estimation over a convex class of functions obtained by considering infinite-dimensional linear combinations of functions in the MARS basis and imposing a variation-based complexity constraint. Although our estimator is defined as a solution to an infinite-dimensional optimization problem, it can be computed via finite-dimensional convex optimization. Under a few standard design assumptions, we prove that our estimator achieves a rate of convergence that depends only logarithmically on dimension and thus avoids the usual curse of dimensionality to some extent. We also show that our method is naturally connected to nonparametric estimation techniques based on smoothness constraints. We implement our method with a cross-validation scheme for selecting the tuning parameter and compare it to the usual MARS method in various simulated and real data settings.
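To make the idea of an l1-penalized fit over MARS-type basis functions concrete, the following is a minimal one-dimensional sketch, not the estimator studied in the paper. It assumes a fixed, finite set of knots, uses hinge functions (x - t)_+ as the basis, and solves the penalized least squares problem (1/(2n)) * RSS + lam * sum|b_j| by plain coordinate descent with an unpenalized intercept. The names `hinge`, `soft_threshold`, `fit_hinge_lasso`, and `predict`, the penalized (rather than constrained) formulation, and the choice to penalize every hinge coefficient are all assumptions made for illustration.

```python
def hinge(x, t):
    """MARS-type hinge function (x - t)_+ ."""
    return max(x - t, 0.0)


def soft_threshold(z, g):
    """Soft-thresholding operator used in lasso coordinate updates."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def fit_hinge_lasso(xs, ys, knots, lam, n_iter=200):
    """Minimize (1/(2n)) * RSS + lam * sum|b_j| over an intercept and
    coefficients b_j of the hinge functions (x - t_j)_+ (illustrative only)."""
    n = len(xs)
    cols = [[hinge(x, t) for x in xs] for t in knots]   # design columns
    col_sq = [sum(v * v for v in c) for c in cols]      # squared column norms
    coef = [0.0] * len(knots)
    intercept = sum(ys) / n
    resid = [y - intercept for y in ys]
    for _ in range(n_iter):
        # Exact update of the unpenalized intercept.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        # Cyclic coordinate updates of the penalized hinge coefficients.
        for j, c in enumerate(cols):
            if col_sq[j] == 0.0:
                continue
            rho = sum(ci * (ri + ci * coef[j]) for ci, ri in zip(c, resid))
            new = soft_threshold(rho, n * lam) / col_sq[j]
            delta = new - coef[j]
            if delta != 0.0:
                resid = [ri - ci * delta for ci, ri in zip(c, resid)]
                coef[j] = new
    return intercept, coef


def predict(x, intercept, coef, knots):
    """Evaluate the fitted piecewise-linear function at x."""
    return intercept + sum(b * hinge(x, t) for b, t in zip(coef, knots))
```

In this sketch a large `lam` shrinks every hinge coefficient to zero, leaving the constant fit, while a small `lam` lets the fit follow kinks in the data. In practice `lam` would be chosen on held-out data, in the spirit of the cross-validation scheme mentioned above.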