We consider kernel-based learning in samplet coordinates with l1-regularization. The l1-regularization term enforces sparsity of the coefficients with respect to the samplet basis; we therefore call this approach samplet basis pursuit. Samplets are wavelet-type signed measures that are tailored to scattered data. They offer properties similar to those of wavelets in terms of localization, multiresolution analysis, and data compression. The class of signals that can be sparsely represented in a samplet basis is considerably larger than the class of signals that admit a sparse representation in the single-scale basis. In particular, every signal that can be represented by the superposition of only a few features of the canonical feature map is also sparse in samplet coordinates. We propose to solve the problem under consideration efficiently by combining soft-shrinkage with the semi-smooth Newton method, and we compare this approach to the fast iterative shrinkage-thresholding algorithm (FISTA). We present numerical benchmarks as well as applications to surface reconstruction from noisy data and to the reconstruction of temperature data using a dictionary of multiple kernels.
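To make the two named building blocks concrete, the following is a minimal sketch of soft-shrinkage and of the FISTA baseline for a generic l1-regularized least-squares problem, min_x 0.5*||Ax - b||^2 + lam*||x||_1. It is not the authors' implementation: the dense matrix `A` is a hypothetical stand-in for the kernel matrix in samplet coordinates, the function names and the Frobenius-norm step-size bound are assumptions made for illustration, and the semi-smooth Newton variant is not shown.

```python
import math


def soft_shrink(v, tau):
    """Entrywise soft-shrinkage, the proximal map of tau * ||.||_1."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a dense matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def fista(A, b, lam, iters=500):
    """Generic FISTA for 0.5 * ||A x - b||^2 + lam * ||x||_1 (illustrative only)."""
    n = len(A[0])
    # Assumption: use the squared Frobenius norm as a crude but safe upper
    # bound on the Lipschitz constant of the gradient, i.e. on ||A^T A||_2.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    if L == 0.0:
        return x
    y, t = x[:], 1.0
    for _ in range(iters):
        # Gradient of the smooth part at the extrapolated point y.
        residual = [p - q for p, q in zip(matvec(A, y), b)]
        grad = rmatvec(A, residual)
        # Gradient step followed by soft-shrinkage with threshold lam / L.
        x_new = soft_shrink([yj - gj / L for yj, gj in zip(y, grad)], lam / L)
        # Momentum update of the extrapolation point.
        t_new = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        beta = (t - 1.0) / t_new
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

For an orthonormal system (for example `A` equal to the identity), the minimizer is simply `soft_shrink(b, lam)`, which gives a quick sanity check of the iteration.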