Monte Carlo is a good sampling strategy for polynomial approximation in high dimensions

This paper concerns the approximation of smooth, high-dimensional functions from limited samples using polynomials. This task lies at the heart of many applications in computational science and engineering, notably some of those arising from parametric modelling and computational uncertainty quantification. It is common to use Monte Carlo sampling in such applications, so as not to succumb to the curse of dimensionality. However, it is well known that such a strategy is theoretically suboptimal. Specifically, there are many polynomial spaces of dimension $n$ for which the sample complexity scales log-quadratically, i.e., like $c \cdot n^2 \cdot \log(n)$ as $n \rightarrow \infty$. This well-documented phenomenon has led to a concerted effort over the last decade to design improved and, moreover, near-optimal strategies, whose sample complexities scale log-linearly, or even linearly, in $n$. In this work we demonstrate that Monte Carlo is actually a perfectly good strategy in high dimensions, despite its apparent suboptimality. We first document this phenomenon empirically via a systematic set of numerical experiments. Next, we present a theoretical analysis that rigorously justifies this fact in the case of holomorphic functions of infinitely many variables. We show that there is a least-squares approximation based on $m$ Monte Carlo samples whose error decays algebraically fast in $m/\log(m)$, with a rate that is the same as that of the best $n$-term polynomial approximation. This result is non-constructive, since it assumes knowledge of a suitable polynomial subspace in which to perform the approximation. We next present a compressed sensing-based scheme that achieves the same rate, except for a larger polylogarithmic factor. This scheme is practical, and numerically it performs as well as or better than well-known adaptive least-squares schemes.

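To make the least-squares setting concrete, the following is a minimal sketch of polynomial least-squares approximation from Monte Carlo samples. It is not the scheme analysed in the paper. It assumes the uniform probability measure on $[-1,1]^d$, a tensor Legendre basis, and a hyperbolic cross index set as the polynomial subspace; the abstract specifies none of these. The function names (`hyperbolic_cross`, `design_matrix`, `monte_carlo_least_squares`) and the test function are hypothetical, chosen only for illustration, and the index set is enumerated by brute force, which is only feasible for small $d$.

```python
import itertools

import numpy as np
from numpy.polynomial import legendre


def hyperbolic_cross(d, order):
    """Multi-indices nu in N_0^d with prod(nu_k + 1) <= order + 1 (assumed subspace)."""
    return [nu for nu in itertools.product(range(order + 1), repeat=d)
            if np.prod(np.array(nu) + 1) <= order + 1]


def design_matrix(points, indices):
    """Evaluate the tensor Legendre basis, orthonormal with respect to the
    uniform probability measure on [-1, 1]^d, at the given points (shape (m, d))."""
    m, d = points.shape
    max_deg = max(max(nu) for nu in indices)
    # vals[k, j, i] = sqrt(2j + 1) * P_j(points[i, k])
    vals = np.empty((d, max_deg + 1, m))
    for j in range(max_deg + 1):
        coef = np.zeros(j + 1)
        coef[j] = 1.0
        vals[:, j, :] = np.sqrt(2 * j + 1) * legendre.legval(points.T, coef)
    A = np.ones((m, len(indices)))
    for col, nu in enumerate(indices):
        for k, j in enumerate(nu):
            A[:, col] *= vals[k, j, :]
    return A


def monte_carlo_least_squares(f, d, order, m, rng):
    """Fit polynomial coefficients from m i.i.d. uniform samples by least squares."""
    indices = hyperbolic_cross(d, order)
    y = rng.uniform(-1.0, 1.0, size=(m, d))    # Monte Carlo sample points
    A = design_matrix(y, indices) / np.sqrt(m)  # normalised design matrix
    b = f(y) / np.sqrt(m)                       # normalised sample values
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return indices, coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = lambda y: np.exp(-np.mean(y, axis=1))   # hypothetical smooth test function
    indices, coeffs = monte_carlo_least_squares(f, d=3, order=9, m=1000, rng=rng)
    # Estimate the root-mean-square error on fresh Monte Carlo test points.
    y_test = rng.uniform(-1.0, 1.0, size=(2000, 3))
    err = np.sqrt(np.mean((design_matrix(y_test, indices) @ coeffs - f(y_test)) ** 2))
    print(f"n = {len(indices)}, m = 1000, RMS error = {err:.2e}")
```

Here $n$ is the number of multi-indices in the chosen set and $m$ is the number of samples. Varying the ratio $m/n$ in this sketch is one way to explore informally how the conditioning of the least-squares problem depends on the sample budget.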