Support Vector Machines (SVMs) are an important tool for classifying scattered data, where one typically has to deal with many data points in high-dimensional spaces. We propose solving SVMs in their primal form using feature maps based on trigonometric functions or wavelets. In low-dimensional settings the Fast Fourier Transform (FFT) and related methods are powerful tools for handling the considered basis functions. As the dimension grows, however, the classical FFT-based methods become inefficient due to the curse of dimensionality. Therefore, we restrict ourselves to multivariate basis functions, each of which depends only on a small number of dimensions. This is motivated by the well-known sparsity-of-effects principle and by recent results on the reconstruction of functions from scattered data via the truncated analysis of variance (ANOVA) decomposition, which makes the resulting model interpretable in terms of the importance of individual features as well as their couplings. Using small superposition dimensions ensures that the computational effort grows only polynomially, rather than exponentially, with the dimension. To enforce sparsity of the basis coefficients, we employ the commonly used $\ell_2$-norm regularization and, in addition, $\ell_1$-norm regularization. The resulting classifying function, a linear combination of basis functions, and its variance can then be analyzed via the classical ANOVA decomposition. Numerical examples show that we can recover the signum of a function that perfectly fits our model assumptions, and that $\ell_1$-norm regularization yields better results in terms of both accuracy and clarity of interpretation.
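To make the overall setup concrete, the following is a minimal, hypothetical sketch of the kind of model the abstract describes: a primal SVM over a trigonometric feature map truncated at superposition dimension $q=2$ (i.e., basis functions depending on at most two coordinates), trained with the hinge loss and $\ell_1$-norm regularization via proximal subgradient descent. All names, parameters, and the specific cosine basis are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from itertools import combinations

def feature_map(X, max_freq=2, q=2):
    """Cosine features on all coordinate subsets of size <= q
    (a truncated-ANOVA-style trigonometric feature map; illustrative)."""
    n, d = X.shape
    feats = [np.ones((n, 1))]                          # constant (ANOVA mean) term
    for order in range(1, q + 1):
        for subset in combinations(range(d), order):
            for k in range(1, max_freq + 1):
                # simple tensor-product cosine basis restricted to the subset
                feats.append(np.prod(np.cos(np.pi * k * X[:, subset]),
                                     axis=1, keepdims=True))
    return np.hstack(feats)

def soft_threshold(w, t):
    """Proximal operator of the l1 norm."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def train_l1_svm(X, y, lam=1e-3, lr=0.05, iters=500):
    """Minimize mean hinge loss + lam * ||w||_1 over the feature map
    by proximal subgradient descent (a generic solver, chosen for brevity)."""
    Phi = feature_map(X)
    w = np.zeros(Phi.shape[1])
    for _ in range(iters):
        margins = y * (Phi @ w)
        active = margins < 1                           # margin-violating points
        grad = -(Phi[active].T @ y[active]) / len(y)   # hinge subgradient
        w = soft_threshold(w - lr * grad, lr * lam)    # l1 proximal step
    return w, Phi

# Toy problem: the label is the signum of a function that depends
# only on the first two of five coordinates, matching the sparsity
# of effects assumption.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 5))
y = np.sign(np.cos(np.pi * X[:, 0]) + 0.5 * np.cos(np.pi * X[:, 1]) + 0.1)
w, Phi = train_l1_svm(X, y)
acc = np.mean(np.sign(Phi @ w) == y)
```

Because each basis function is indexed by an explicit coordinate subset, the learned coefficients can be grouped by subset afterwards to read off which features and couplings the $\ell_1$ penalty kept active, which is the interpretability aspect the abstract refers to.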