This manuscript addresses the problem of approximating an unknown function from point evaluations. When these point evaluations are costly to obtain, minimising the required sample size becomes crucial, and it is impractical to reserve a sufficiently large test sample for estimating the approximation accuracy. Therefore, an approximation with a certified quasi-optimality factor is required. This manuscript shows that such an approximation can be obtained when the sought function lies in a reproducing kernel Hilbert space (RKHS) and is to be approximated in a finite-dimensional linear subspace $\mathcal{V}_d$. However, selecting the sample points to minimise the quasi-optimality factor requires optimising over an infinite set of points and computing exact inner products in the RKHS, which is often infeasible in practice. Extending results from optimal sampling for $L^2$ approximation, the manuscript proves that random points, drawn independently from the Christoffel sampling distribution associated with $\mathcal{V}_d$, can yield a controllable quasi-optimality factor with high probability. Inspired by this result, a novel sampling scheme, coined subspace-informed volume sampling, is introduced and evaluated in numerical experiments, where it outperforms classical i.i.d. Christoffel sampling and continuous volume sampling. To reduce the size of such a random sample, an additional greedy subsampling scheme with provable suboptimality bounds is introduced. The presentation is also of independent interest to the community researching the parametrised background data weak (PBDW) method, as it offers a simpler interpretation of the method.
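To make the notion of i.i.d. Christoffel sampling concrete, the following is a minimal sketch of its classical discrete $L^2$ analogue, not the RKHS construction developed in the manuscript. It assumes a uniform candidate grid on $[-1, 1]$ and takes $\mathcal{V}_d$ to be the polynomials of degree below $d = 5$. The function name `christoffel_probabilities`, the grid size, and the sample size are all hypothetical choices made for illustration.

```python
import numpy as np


def christoffel_probabilities(basis_matrix):
    """Discrete Christoffel (leverage-score) probabilities on a candidate grid.

    basis_matrix has shape (n_points, d); its columns span the subspace
    evaluated at the candidate points (illustrative assumption).
    """
    # Orthonormalise the columns with respect to the Euclidean inner
    # product on the grid (reduced QR factorisation).
    q, _ = np.linalg.qr(basis_matrix)
    d = q.shape[1]
    # The squared row norms of q sum to d, so dividing by d gives a
    # probability vector over the candidate points.
    return np.sum(q**2, axis=1) / d


rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 2001)  # candidate points (assumed)
d = 5                                # subspace dimension (assumed)
basis = np.polynomial.legendre.legvander(grid, d - 1)  # shape (2001, d)

probs = christoffel_probabilities(basis)
# Draw 20 i.i.d. points from the discrete Christoffel distribution.
sample = rng.choice(grid, size=20, replace=True, p=probs)
```

For polynomial subspaces this distribution places more mass near the endpoints of the interval than in its centre, which is the qualitative behaviour usually associated with Christoffel sampling.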