We consider scattered data approximation in samplet coordinates with $\ell_1$-regularization. The $\ell_1$-regularization term enforces sparsity of the coefficients with respect to the samplet basis. Samplets are wavelet-type signed measures that are tailored to scattered data. They offer properties similar to those of wavelets in terms of localization, multiresolution analysis, and data compression. By using the Riesz isometry, we embed samplets into reproducing kernel Hilbert spaces and discuss the properties of the resulting functions. We argue that the class of signals that are sparse with respect to the embedded samplet basis is considerably larger than the class of signals that are sparse with respect to the basis of kernel translates. Conversely, every signal that is a linear combination of only a few kernel translates is sparse in samplet coordinates. Therefore, samplets enable the use of well-established multiresolution techniques on general scattered data sets. We propose to solve the problem under consideration rapidly by combining soft-shrinkage with the semi-smooth Newton method. Leveraging the sparse representation of kernel matrices in samplet coordinates, this approach converges faster than the fast iterative shrinkage thresholding algorithm and is feasible for large-scale data. Numerical benchmarks demonstrate the superiority of the multiresolution approach over the single-scale approach. As large-scale applications, we consider surface reconstruction from scattered data and the reconstruction of scattered temperature data using a dictionary of multiple kernels.
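To make the role of soft-shrinkage concrete, the following is a minimal sketch of the fast iterative shrinkage thresholding algorithm (FISTA), the baseline method named above, applied to a generic problem of the form $\min_x \tfrac12\|Ax-b\|_2^2 + \lambda\|x\|_1$. It is not the semi-smooth Newton scheme proposed in the paper. The dense NumPy matrix `A` is an assumption standing in for a samplet-compressed kernel matrix, and the names `soft_shrink`, `fista`, `lam`, and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_shrink(v, tau):
    """Soft-shrinkage: the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def fista(A, b, lam, n_iter=200):
    """Approximately minimize 0.5 * ||A x - b||_2^2 + lam * ||x||_1.

    A is a dense matrix here (an assumption for illustration only).
    """
    # Lipschitz constant of the gradient of the smooth part:
    # the squared spectral norm of A.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)                      # gradient at the extrapolated point
        x_new = soft_shrink(y - grad / L, lam / L)    # proximal gradient step
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x


# Usage on a small synthetic problem with a sparse ground truth.
rng = np.random.default_rng(0)
A_demo = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[3, 17, 58]] = [1.5, -2.0, 0.7]
b_demo = A_demo @ x_true
x_hat = fista(A_demo, b_demo, lam=0.1)
n_nonzero = int(np.count_nonzero(x_hat))  # number of active coefficients
```

The thresholding step sets small coefficients exactly to zero, which is how the $\ell_1$ term produces sparse coefficient vectors. For the identity matrix the minimizer is simply `soft_shrink(b, lam)`, which gives a quick sanity check of the sketch.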