In the context of kernel machines, polynomial and Fourier features are commonly used to provide a nonlinear extension to linear models by mapping the data to a higher-dimensional space. Unless one considers the dual formulation of the learning problem, which renders exact large-scale learning infeasible, the number of model parameters grows exponentially with the dimensionality of the data because of the tensor-product structure of these features, which prevents tackling high-dimensional problems. One possible approach to circumvent this exponential scaling is to exploit the tensor structure present in the features by constraining the model weights to be an underparametrized tensor network. In this paper we quantize, i.e. further tensorize, polynomial and Fourier features. Based on this feature quantization, we propose to quantize the associated model weights, yielding quantized models. We show that, for the same number of model parameters, the resulting quantized models have a higher bound on the VC-dimension than their non-quantized counterparts, at no additional computational cost and while learning from identical features. We verify experimentally how this additional tensorization regularizes the learning problem by prioritizing the most salient features in the data, and how it provides models with increased generalization capabilities. Finally, we benchmark our approach on a large regression task, achieving state-of-the-art results on a laptop computer.
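To make "quantizing" a feature map concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single scalar input `x`, a feature dimension of exactly `2**num_bits`, unnormalized monomial features `x**m`, and complex exponential features `exp(1j*m*x)` with integer frequencies starting at zero. The function names are hypothetical. The sketch relies only on the identity that writing the index `m` in binary turns `x**m` (and likewise `exp(1j*m*x)`) into a product of `num_bits` two-element factors, so the full feature vector is a Kronecker product of length-2 vectors.

```python
from functools import reduce

import numpy as np


def polynomial_features(x, num_bits):
    """Dense monomial features [1, x, x^2, ..., x^(2^num_bits - 1)]."""
    return x ** np.arange(2 ** num_bits)


def quantized_polynomial_factors(x, num_bits):
    """Length-2 factors [1, x^(2^k)] for k = 0, ..., num_bits - 1."""
    return [np.array([1.0, x ** (2 ** k)]) for k in range(num_bits)]


def fourier_features(x, num_bits):
    """Dense features exp(1j * m * x) for m = 0, ..., 2^num_bits - 1."""
    return np.exp(1j * np.arange(2 ** num_bits) * x)


def quantized_fourier_factors(x, num_bits):
    """Length-2 factors [1, exp(1j * 2^k * x)] for k = 0, ..., num_bits - 1."""
    return [np.array([1.0, np.exp(1j * (2 ** k) * x)]) for k in range(num_bits)]


def assemble(factors):
    """Kronecker product of the factors, most significant bit first.

    np.kron(a, b) places the index of `a` in the high-order position, so the
    factor for bit k = num_bits - 1 must come first for entry m of the result
    to correspond to index m of the dense feature vector.
    """
    return reduce(np.kron, reversed(factors))


# Illustrative check with arbitrary values (x and num_bits are assumptions).
x, num_bits = 0.7, 3
dense = polynomial_features(x, num_bits)
rebuilt = assemble(quantized_polynomial_factors(x, num_bits))
print(np.allclose(dense, rebuilt))
```

With `num_bits = 3`, the dense vector has 8 entries while the quantized form stores 3 factors of length 2. In practice the Kronecker product would not be assembled explicitly; it is formed here only to check the factorization against the dense features.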