We study the asymptotic behavior of least-squares cross-validation bandwidth selection in kernel density estimation on the $d$-dimensional hypersphere, $d\geq 1$. We show that the exact rate of convergence with respect to the optimal bandwidth minimizing the mean integrated squared error, shown to exist under mild non-uniformity conditions, is $n^{-d/(2d+8)}$, thus approaching the $n^{-1/2}$ parametric rate as $d$ grows. This ``blessing of dimensionality'' in bandwidth selection offers theoretical support for using the conceptually simpler cross-validation selector over plug-in techniques in larger dimensions $d$. We compare this result with its counterpart for bandwidth estimation on the $d$-dimensional Euclidean space through explicit expressions for the asymptotic variance functionals. Numerical experiments corroborate the speed of this convergence across an array of scenarios and dimensions, and illustrate the tipping dimension at which cross-validation outperforms plug-in approaches.
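To make the stated rate concrete, the short sketch below tabulates the exponent $d/(2d+8)$ for a few dimensions and its gap to the parametric exponent $1/2$. Only the formula for the exponent comes from the text above; the function name `cv_rate_exponent` and the chosen values of $d$ are illustrative assumptions, and the gap $2/(d+4)$ follows from elementary algebra.

```python
from fractions import Fraction


def cv_rate_exponent(d: int) -> Fraction:
    """Return a = d / (2d + 8), so that the stated rate is n^{-a}.

    Hypothetical helper for illustration; it only evaluates the exponent
    quoted in the abstract and does no bandwidth selection itself.
    """
    if d < 1:
        raise ValueError("the dimension d must satisfy d >= 1")
    return Fraction(d, 2 * d + 8)


PARAMETRIC = Fraction(1, 2)  # exponent of the n^{-1/2} parametric rate

# Illustrative dimensions (an arbitrary choice, not taken from the paper).
for d in (1, 2, 3, 10, 100):
    a = cv_rate_exponent(d)
    gap = PARAMETRIC - a  # algebraically equal to 2 / (d + 4)
    print(f"d={d:>3}  exponent={float(a):.4f}  gap to 1/2={float(gap):.4f}")
```

The gap $1/2 - d/(2d+8) = 2/(d+4)$ shrinks like $1/d$, which is one way to read the claim that the rate approaches the parametric one as $d$ grows.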