The Highly Adaptive Lasso (HAL) is a nonparametric regression method that achieves almost dimension-free convergence rates under minimal smoothness assumptions, but its implementation can be computationally prohibitive in high dimensions because of the large basis matrix it requires. The Highly Adaptive Ridge (HAR) has been proposed as a scalable alternative. Building on both procedures, we introduce the Principal Component based Highly Adaptive Lasso (PCHAL) and the Principal Component based Highly Adaptive Ridge (PCHAR). These estimators perform an outcome-blind dimension reduction that offers substantial gains in computational efficiency while matching the empirical performance of HAL and HAR. We also uncover a striking spectral link between the leading principal components of the HAL/HAR Gram operator and a discrete sinusoidal basis, revealing an explicit Fourier-type structure underlying the PC truncation.
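To make the idea of an outcome-blind principal component truncation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single covariate, a zero-order indicator basis with knots at the observed points, and penalties applied directly to the coefficients of the principal component scores; the function names (`hal_basis_1d`, `pc_scores`, `fit_pc_ridge`, `fit_pc_lasso`, `predict`), the simulated data, and the tuning values are all hypothetical choices made for illustration.

```python
import numpy as np


def hal_basis_1d(x, knots):
    """Indicator basis: column j is 1{x >= knots[j]} (assumed 1D, zero-order)."""
    return (x[:, None] >= knots[None, :]).astype(float)


def pc_scores(H, k):
    """Top-k principal component scores of the centered basis.

    Uses only the basis matrix, never the outcome, so the reduction
    is outcome-blind.
    """
    mu = H.mean(axis=0)
    U, s, Vt = np.linalg.svd(H - mu, full_matrices=False)
    return U[:, :k] * s[:k], s[:k], Vt[:k], mu


def fit_pc_ridge(Z, s, y_centered, lam):
    """Ridge on PC scores; Z'Z = diag(s**2), so the solution is closed form."""
    return (Z.T @ y_centered) / (s**2 + lam)


def fit_pc_lasso(Z, s, y_centered, lam):
    """Lasso on PC scores, minimizing 0.5*||y - Z b||^2 + lam*||b||_1.

    Because the score columns are orthogonal, the minimizer is a
    coordinate-wise soft-thresholding of Z'y.
    """
    zy = Z.T @ y_centered
    return np.sign(zy) * np.maximum(np.abs(zy) - lam, 0.0) / s**2


def predict(x_new, knots, mu, Vt, beta, intercept):
    """Project new basis rows onto the retained components and predict."""
    H_new = hal_basis_1d(x_new, knots)
    return intercept + ((H_new - mu) @ Vt.T) @ beta


# Simulated example (hypothetical data and tuning values).
rng = np.random.default_rng(0)
n, k = 200, 10
x = np.sort(rng.uniform(0.0, 1.0, size=n))
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(n)

H = hal_basis_1d(x, x)            # n x n basis matrix
Z, s, Vt, mu = pc_scores(H, k)    # n x k scores, computed without y

intercept = y.mean()
beta_ridge = fit_pc_ridge(Z, s, y - intercept, lam=1e-3)
beta_lasso = fit_pc_lasso(Z, s, y - intercept, lam=0.05)

fitted_ridge = predict(x, x, mu, Vt, beta_ridge, intercept)
fitted_lasso = predict(x, x, mu, Vt, beta_lasso, intercept)
mse_ridge = float(np.mean((y - fitted_ridge) ** 2))
mse_lasso = float(np.mean((y - fitted_lasso) ** 2))
```

In this sketch the regression step works with an n x k score matrix instead of the full n x n basis, which is where the computational saving would come from. In practice the number of components and the penalty level would be chosen by cross-validation rather than fixed as above.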