Surrogate kernel-based methods offer a flexible solution to structured output prediction by leveraging the kernel trick in both the input and output spaces. In contrast to energy-based models, they avoid paying the cost of inference during training while enjoying statistical guarantees. However, without approximation, these approaches can only be used on a limited amount of training data. In this paper, we propose to equip surrogate kernel methods with approximations based on sketching, seen as low-rank projections of both the input and output feature maps. We showcase the approach on Input Output Kernel Ridge Regression (or Kernel Dependency Estimation) and provide excess risk bounds that carry over directly to the final predictive model. An analysis of the time and memory complexity shows that sketching the input kernel mostly reduces training time, while sketching the output kernel reduces inference time. Furthermore, we show that Gaussian and sub-Gaussian sketches are admissible sketches in the sense that they induce projection operators ensuring a small excess risk. Experiments on different tasks consolidate our findings.
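To make the idea of input sketching concrete, the following is a minimal sketch of kernel ridge regression with a Gaussian sketch applied to the input kernel. It is an illustration under stated assumptions, not the implementation used in the paper: outputs are plain real vectors rather than structured objects embedded through an output kernel, output sketching is not shown, the input kernel is assumed to be a Gaussian (RBF) kernel, and the names `gaussian_kernel`, `fit_sketched_krr`, and `predict_sketched_krr` are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def fit_sketched_krr(X, Y, m, lam=1e-3, gamma=1.0, seed=0):
    """Fit kernel ridge regression restricted to an m-dimensional sketched subspace.

    Assumption: the coefficients are constrained to alpha = S.T @ theta, where S
    is an (m, n) Gaussian sketch, so the linear system to solve is m x m
    instead of n x n.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Gaussian sketch with i.i.d. N(0, 1/m) entries.
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    # For simplicity the full n x n kernel matrix is still formed here.
    K = gaussian_kernel(X, X, gamma)
    KS = K @ S.T  # shape (n, m)
    # Normal equations of the sketched problem:
    # (S K^2 S^T + n * lam * S K S^T) theta = S K Y
    A = KS.T @ KS + n * lam * (S @ KS)
    # lstsq tolerates the ill-conditioning caused by fast kernel eigenvalue decay.
    theta = np.linalg.lstsq(A, KS.T @ Y, rcond=None)[0]
    return S, theta


def predict_sketched_krr(X_train, S, theta, X_new, gamma=1.0):
    """Predict with f(x) = k(x, X_train) @ S.T @ theta."""
    return gaussian_kernel(X_new, X_train, gamma) @ (S.T @ theta)
```

With `m` much smaller than the number of training points `n`, the solve costs O(m^3) rather than O(n^3), which is the kind of training-time saving the abstract attributes to input sketching. Sketching the output kernel, which the abstract associates with faster inference, would require an output feature map and a decoding step that this simplified example leaves out.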