We investigate $L_2$ boosting in the context of kernel regression. Kernel smoothers generally lack appealing properties such as symmetry and positive definiteness, which are critical both for theoretical analysis and for good practical performance. We consider a projection-based smoother (Huang and Chen, 2008) that is symmetric, positive definite, and shrinking. Theoretical results based on the orthonormal decomposition of the smoother provide additional insight into the boosting algorithm. In our asymptotic framework, the full-rank smoother may be replaced with a low-rank approximation. We show that the rank $d(n)$ of the low-rank smoother is bounded above by $O(h^{-1})$, where $h$ is the bandwidth. Our numerical findings show that low-rank smoothers may outperform full-rank smoothers in prediction accuracy. Furthermore, we show that the boosting estimator with a low-rank smoother achieves the optimal convergence rate. Finally, to improve the performance of the boosting algorithm in the presence of outliers, we propose a novel robustified boosting algorithm that can be used with any smoother discussed in the study. We evaluate the numerical performance of the proposed approaches using simulations and a real-world data example.
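To make the role of the orthonormal decomposition concrete, the following is a minimal sketch, not the method of the paper. It assumes a Gaussian-kernel ridge smoother $S = K(K + \rho I)^{-1}$ as a stand-in for the projection-based smoother of Huang and Chen (2008), which is not reproduced here; the stand-in shares only the properties of being symmetric with eigenvalues in $[0, 1)$. The function names, the `ridge` parameter, and the convention that `n_iter` counts boosting steps after the initial fit are all illustrative assumptions. For a symmetric smoother $S = U \Lambda U^\top$, the $L_2$ boosting recursion $\hat f_m = \hat f_{m-1} + S(y - \hat f_{m-1})$ with $\hat f_0 = S y$ has the closed form $\hat f_m = U\,[I - (I - \Lambda)^{m+1}]\,U^\top y$, and a low-rank version keeps only the leading $d$ eigenpairs.

```python
import numpy as np

def symmetric_smoother(x, h, ridge=1e-2):
    """Eigenpairs of a stand-in symmetric smoother S = K (K + ridge*I)^{-1}.

    K is a Gaussian kernel matrix with bandwidth h. This is an assumed
    substitute for illustration, not the Huang and Chen (2008) smoother.
    """
    diff = x[:, None] - x[None, :]
    K = np.exp(-0.5 * (diff / h) ** 2)
    mu, U = np.linalg.eigh(K)        # K is symmetric, so eigh applies
    mu = np.clip(mu, 0.0, None)      # remove tiny negative round-off values
    lam = mu / (mu + ridge)          # eigenvalues of S, all in [0, 1)
    order = np.argsort(lam)[::-1]    # sort from largest to smallest
    return lam[order], U[:, order]

def boost_fit(y, lam, U, n_iter, rank=None):
    """Closed-form L2 boosting fit after n_iter steps beyond the initial fit.

    If rank is given, only the leading `rank` eigenpairs are used,
    which corresponds to a low-rank approximation of the smoother.
    """
    if rank is not None:
        lam, U = lam[:rank], U[:, :rank]
    shrink = 1.0 - (1.0 - lam) ** (n_iter + 1)
    return U @ (shrink * (U.T @ y))

def boost_iterative(y, S, n_iter):
    """Reference implementation: repeatedly smooth the current residuals."""
    fit = S @ y
    for _ in range(n_iter):
        fit = fit + S @ (y - fit)
    return fit

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(0.0, 1.0, 100))
    y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(100)
    lam, U = symmetric_smoother(x, h=0.1)
    full = boost_fit(y, lam, U, n_iter=20)
    low = boost_fit(y, lam, U, n_iter=20, rank=10)
    print("max |full - low-rank| fit difference:", np.max(np.abs(full - low)))
```

Because the eigenvalues lie in $[0, 1)$, each factor $1 - (1 - \lambda_j)^{m+1}$ increases toward 1 as $m$ grows, so components with small eigenvalues are fitted slowly. Truncating to the leading eigenpairs discards exactly those slowly fitted components.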