Kernel methods are learning algorithms that enjoy solid theoretical foundations but suffer from significant computational limitations. Sketching, which consists in looking for solutions in a subspace of reduced dimension, is a well-studied approach to alleviating this computational burden. However, statistically accurate sketches, such as the Gaussian one, usually contain few zero entries, so that applying them to kernel methods and their non-sparse Gram matrices remains slow in practice. In this paper, we show that sparsified Gaussian (and Rademacher) sketches still produce theoretically valid approximations while allowing for substantial time and space savings thanks to an efficient \emph{decomposition trick}. To support our method, we derive excess risk bounds for both single- and multiple-output kernel problems with generic Lipschitz losses, thereby providing new guarantees for a wide range of applications, from robust regression to multiple quantile regression. Our theoretical results are complemented by experiments showing the empirical superiority of our approach over state-of-the-art sketching methods.
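To make the idea of a sparsified sketch and a decomposition trick concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the construction, so the Bernoulli mask with probability `p`, the `1 / sqrt(s * p)` scaling, the RBF kernel, and the helper names `p_sparsified_sketch`, `decompose` and `sketched_gram` are all hypothetical. The point shown is that a sketch whose columns are mostly zero can be factored into a small dense block acting on a subset of indices, so only the matching sub-block of the Gram matrix is ever touched.

```python
import numpy as np


def p_sparsified_sketch(n, s, p, rng):
    """Return an (s, n) sketch with entries B_ij * G_ij / sqrt(s * p).

    Assumed construction: B_ij ~ Bernoulli(p) and G_ij ~ N(0, 1), scaled
    so that E[S.T @ S] is the identity.
    """
    mask = rng.random((s, n)) < p
    gauss = rng.standard_normal((s, n))
    return (mask * gauss) / np.sqrt(s * p)


def decompose(S):
    """Split S into its non-zero columns and the indices of those columns."""
    idx = np.flatnonzero(np.any(S != 0, axis=0))
    return S[:, idx], idx


def sketched_gram(K, S):
    """Compute S @ K @ S.T using only the sub-block of K indexed by idx."""
    S_dense, idx = decompose(S)
    return S_dense @ K[np.ix_(idx, idx)] @ S_dense.T


rng = np.random.default_rng(0)
n, s, p = 200, 10, 0.05

# Toy data and an RBF Gram matrix (kernel choice is an assumption).
X = rng.standard_normal((n, 3))
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-0.5 * sq_dists)

S = p_sparsified_sketch(n, s, p, rng)
fast = sketched_gram(K, S)   # touches only the selected rows/columns of K
full = S @ K @ S.T           # reference computation on the full matrix
```

In this toy setting `fast` and `full` agree up to floating-point error, while the first only reads the rows and columns of `K` whose indices survive the sparsification. In practice one would evaluate the kernel only on those indices rather than build the full `K` first.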