We describe a fast method for computing leave-one-out cross-validation (LOOCV) for $k$-nearest neighbours ($k$-NN) regression. We show that, under a tie-breaking condition for nearest neighbours, the LOOCV estimate of the mean square error for $k$-NN regression is identical to the mean square error of $(k+1)$-NN regression evaluated on the training data, multiplied by the scaling factor $(k+1)^2/k^2$. Therefore, to compute the LOOCV score, one needs to fit $(k+1)$-NN regression only once, rather than repeating the training-validation of $k$-NN regression once per training point. Numerical experiments confirm the validity of the fast computation method.
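The identity can be checked numerically with a short sketch. The code below is an illustration and not the authors' implementation. It assumes Euclidean distance, uniformly weighted neighbour averaging, and continuous random inputs, so that all pairwise distances are distinct and each training point is its own nearest neighbour; this is one simple way to satisfy a tie-breaking condition, and the exact condition used in the paper may be stated differently. The function names `knn_predict`, `loocv_mse_naive`, and `loocv_mse_fast` are hypothetical. The intuition is that the $(k+1)$-NN prediction at a training point averages that point's own response with those of its $k$ nearest other points, so its residual is $k/(k+1)$ times the leave-one-out residual.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Plain k-NN regression: average the responses of the k nearest training points."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    # Indices of the k smallest distances per query point (stable sort for determinism).
    idx = np.argsort(d, axis=1, kind="stable")[:, :k]
    return y_train[idx].mean(axis=1)

def loocv_mse_naive(X, y, k):
    """Reference LOOCV: hold out each point in turn and predict it with k-NN."""
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i  # drop the i-th point from the training set
        pred = knn_predict(X[mask], y[mask], X[i:i + 1], k)[0]
        errs[i] = (y[i] - pred) ** 2
    return errs.mean()

def loocv_mse_fast(X, y, k):
    """Fast LOOCV: training MSE of (k+1)-NN, rescaled by (k+1)^2 / k^2."""
    pred = knn_predict(X, y, X, k + 1)  # each point counts itself as a neighbour
    train_mse = np.mean((y - pred) ** 2)
    return (k + 1) ** 2 / k ** 2 * train_mse

# Synthetic data (an assumption for illustration; not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)
k = 5

naive = loocv_mse_naive(X, y, k)
fast = loocv_mse_fast(X, y, k)
print(np.isclose(naive, fast))
```

On data with duplicated points or tied distances the two quantities may differ, which is why the tie-breaking condition matters. The naive loop performs one neighbour search per held-out point on a reduced training set, whereas the fast version needs a single $(k+1)$-NN pass over the full training set.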