Kernel methods are learning algorithms that enjoy solid theoretical foundations but suffer from significant computational limitations. Sketching, which consists of searching for solutions within a subspace of reduced dimension, is a well-studied approach to alleviating this computational burden. However, statistically accurate sketches, such as the Gaussian one, usually contain few zero entries, so applying them to kernel methods and their non-sparse Gram matrices remains slow in practice. In this paper, we show that sparsified Gaussian (and Rademacher) sketches still produce theoretically valid approximations while allowing for substantial time and space savings thanks to an efficient \emph{decomposition trick}. To support our method, we derive excess risk bounds for both single- and multiple-output kernel problems with generic Lipschitz losses, thereby providing new guarantees for a wide range of applications, from robust regression to multiple quantile regression. Our theoretical results are complemented by experiments showing the empirical superiority of our approach over state-of-the-art sketching methods.
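To make the computational argument concrete, the following is a minimal sketch of why sparsity in the sketch matrix can reduce the cost of forming a sketched Gram matrix. It is an illustration under stated assumptions and not necessarily the decomposition used in the paper: we assume a sketch $S \in \mathbb{R}^{s \times n}$ whose entries are independently kept with probability $p$ and drawn from $\mathcal{N}(0,1)$, scaled by $1/\sqrt{sp}$, and a Gaussian kernel. The function names, the scaling, and the parameter values are all hypothetical. The only fact relied upon is that columns of $S$ that are entirely zero contribute nothing to $S K S^\top$, so only the kernel sub-block indexed by the non-zero columns needs to be evaluated.

```python
import numpy as np

def sparsified_gaussian_sketch(s, n, p, rng):
    """s x n sketch: each entry is N(0, 1) with probability p, else 0.

    The 1/sqrt(s * p) scaling is an assumption chosen so that E[S^T S] = I.
    """
    mask = rng.random((s, n)) < p
    gauss = rng.standard_normal((s, n))
    return mask * gauss / np.sqrt(s * p)

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def sketched_gram(X, S, kernel):
    """Compute S K S^T while evaluating the kernel only on 'active' points,
    i.e. those whose column in S has at least one non-zero entry."""
    active = np.flatnonzero(np.any(S != 0, axis=0))
    S_act = S[:, active]                   # s x n_active
    K_act = kernel(X[active], X[active])   # n_active x n_active sub-block of K
    return S_act @ K_act @ S_act.T, active

rng = np.random.default_rng(0)
n, d, s, p = 200, 3, 10, 0.02              # illustrative sizes only
X = rng.standard_normal((n, d))
S = sparsified_gaussian_sketch(s, n, p, rng)

SKS_fast, active = sketched_gram(X, S, gaussian_kernel)
SKS_full = S @ gaussian_kernel(X, X) @ S.T  # reference: full n x n Gram matrix
print(len(active), "of", n, "points require kernel evaluations")
print("matches full computation:", np.allclose(SKS_fast, SKS_full))
```

In this toy setting the restricted computation evaluates a kernel block of size `len(active)` squared instead of `n` squared, which is where time and memory could be saved when $p$ is small.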