This paper introduces two randomized preconditioning techniques for robustly solving kernel ridge regression (KRR) problems with a medium to large number of data points ($10^4 \leq N \leq 10^7$). The first method, RPCholesky preconditioning, is capable of accurately solving the full-data KRR problem in $O(N^2)$ arithmetic operations, assuming sufficiently rapid polynomial decay of the kernel matrix eigenvalues. The second method, KRILL preconditioning, offers an accurate solution to a restricted version of the KRR problem involving $k \ll N$ selected data centers at a cost of $O((N + k^2) k \log k)$ operations. The proposed methods solve a broad range of KRR problems and overcome the failure modes of previous KRR preconditioners, making them ideal for practical applications.
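To make the first idea concrete, the following is a minimal sketch of one way a low-rank preconditioner can be built by randomly pivoted Cholesky and used inside preconditioned conjugate gradient. It is an illustration under stated assumptions, not the paper's reference implementation: the Gaussian kernel, the synthetic data, the shift `mu`, the rank `k`, and the helper names `rpcholesky`, `make_preconditioner`, and `pcg` are all chosen here for the example. It assumes the KRR system has the form $(A + \mu I)\beta = y$ and that the preconditioner is $F F^\top + \mu I$, where $F F^\top \approx A$.

```python
import numpy as np

def rpcholesky(A, k, rng):
    """Rank-k randomly pivoted Cholesky sketch: returns F with A ~= F @ F.T."""
    N = A.shape[0]
    F = np.zeros((N, k))
    d = np.diag(A).astype(float).copy()      # diagonal of the residual matrix
    for i in range(k):
        total = d.sum()
        if total <= 1e-12:                   # residual is numerically zero
            return F[:, :i]
        s = rng.choice(N, p=d / total)       # pivot sampled proportionally to d
        g = A[:, s] - F[:, :i] @ F[s, :i]    # residual column at the pivot
        F[:, i] = g / np.sqrt(g[s])
        d = np.maximum(d - F[:, i] ** 2, 0.0)
    return F

def make_preconditioner(F, mu):
    """Apply (F F^T + mu I)^{-1} through the Woodbury identity."""
    M = F.T @ F + mu * np.eye(F.shape[1])    # small k-by-k system
    def apply(v):
        return (v - F @ np.linalg.solve(M, F.T @ v)) / mu
    return apply

def pcg(matvec, b, precond, tol=1e-8, maxiter=500):
    """Textbook preconditioned conjugate gradient for an SPD system."""
    x = np.zeros_like(b)
    r = b.copy()
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for it in range(maxiter):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, it + 1
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

# Synthetic example (all values below are assumptions for illustration).
rng = np.random.default_rng(0)
N, k, mu = 300, 60, 1e-3
X = rng.standard_normal((N, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(N)

# Gaussian kernel matrix with unit bandwidth.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
A = np.exp(-0.5 * sq_dists)

F = rpcholesky(A, k, rng)
precond = make_preconditioner(F, mu)
beta, iters = pcg(lambda v: A @ v + mu * v, y, precond)

rel_res = np.linalg.norm(A @ beta + mu * beta - y) / np.linalg.norm(y)
print(f"PCG iterations: {iters}, relative residual: {rel_res:.2e}")
```

In this sketch each PCG iteration costs one multiplication with the full kernel matrix, $O(N^2)$ operations, plus $O(Nk)$ for the preconditioner, so the total cost stays $O(N^2)$ only when the iteration count is bounded independently of $N$. The restricted problem with $k \ll N$ centers is not shown here.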