We consider the $\mathcal{H}^2$-formatted compression and computational estimation of covariance functions on a compact set in $\mathbb{R}^d$. The classical sample covariance or Monte Carlo estimator is prohibitively expensive for many practically relevant problems, which often require both approximation spaces with many degrees of freedom and many samples for the estimator. In this article, we propose and analyze a data-sparse multilevel sample covariance estimator, i.e., a multilevel Monte Carlo estimator. For this purpose, we generalize the notion of asymptotically smooth kernel functions to a Gevrey-type class of kernels, for which we derive new variable-order $\mathcal{H}^2$-approximation rates. These variable-order $\mathcal{H}^2$-approximations can be considered a variant of $hp$-approximations. Our multilevel sample covariance estimator then uses an approximate multilevel hierarchy of variable-order $\mathcal{H}^2$-approximations to compress the sample covariances on each level. The non-nestedness of the different levels makes the reduction to the final estimator nontrivial, and we present a suitable algorithm that handles this task in linear complexity. This allows for a data-sparse multilevel estimator of Gevrey covariance kernel functions in the best possible complexity for Monte Carlo-type multilevel estimators, which is quadratic. We present numerical examples that estimate covariance matrices with tens of billions of entries.
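As a rough illustration of the multilevel idea, a multilevel sample covariance estimator can be sketched in generic notation. Everything in this sketch is an assumption made for illustration and is not taken from the text: a centered random field $Z$, approximations $Z_\ell$ on levels $\ell = 0, \dots, L$ with $Z_{-1} := 0$, and $M_\ell$ independent sample pairs $\bigl(Z_\ell^{(i,\ell)}, Z_{\ell-1}^{(i,\ell)}\bigr)$ on level $\ell$, where both members of a pair are computed from the same realization.

$$
\operatorname{Cov}[Z] \approx \mathbb{E}[Z_L \otimes Z_L] = \sum_{\ell=0}^{L} \mathbb{E}\bigl[Z_\ell \otimes Z_\ell - Z_{\ell-1} \otimes Z_{\ell-1}\bigr] \approx \sum_{\ell=0}^{L} \frac{1}{M_\ell} \sum_{i=1}^{M_\ell} \Bigl( Z_\ell^{(i,\ell)} \otimes Z_\ell^{(i,\ell)} - Z_{\ell-1}^{(i,\ell)} \otimes Z_{\ell-1}^{(i,\ell)} \Bigr).
$$

In such a telescoping construction, coarse levels typically use many cheap samples and fine levels few expensive ones. In the setting described above, each level's sample covariance would additionally be held in compressed $\mathcal{H}^2$ form and not as a dense matrix.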