Multilevel Surrogate-based Control Variates

Monte Carlo (MC) sampling is a popular method for estimating the statistics (e.g., the expectation and variance) of a random variable. Its slow convergence has led to the emergence of advanced techniques that reduce the variance of the MC estimator for the outputs of computationally expensive solvers. The control variates (CV) method corrects the MC estimator with a term derived from auxiliary random variables that are highly correlated with the original random variable. These auxiliary variables may come from surrogate models. Such a surrogate-based CV strategy is extended here to the multilevel Monte Carlo (MLMC) framework, which relies on a sequence of levels corresponding to numerical simulators of increasing accuracy and computational cost. MLMC combines output samples obtained across levels into a telescoping sum of differences between MC estimators for successive fidelities. In this paper, we introduce three multilevel variance reduction strategies that rely on surrogate-based CV and MLMC. MLCV is presented as an extension of CV in which the correction terms derived from surrogate models of the simulators at different levels add up. MLMC-CV improves the MLMC estimator by using a CV based on a surrogate of the correction term at each level. Further variance reduction is achieved in the MLMC-MLCV strategy, which uses the surrogate-based CVs of all the levels. Alternative solutions that use only a reduced subset of the surrogates for the multilevel estimation are also introduced. The proposed methods are assessed on a test case from the literature consisting of a spectral discretization of an uncertain 1D heat equation, where the statistic of interest is the expected value of the temperature integrated along the domain at a given time. The results are assessed in terms of the accuracy and computational cost of the multilevel estimators, depending on whether the construction of the surrogates, and the associated computational cost, precedes the evaluation of the estimator. The results show that when the lower-fidelity outputs are strongly correlated with the high-fidelity outputs, a significant variance reduction is obtained by using surrogate models for the coarser levels only. They also show that taking advantage of pre-existing surrogate models is an even more efficient strategy.
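
To illustrate the single-level, surrogate-based CV idea summarized above, the following is a minimal sketch and not the paper's implementation. The functions `simulator` and `surrogate`, the uniform input distribution, and the sample size are all hypothetical stand-ins chosen so that the surrogate mean is known exactly. Note that estimating the CV coefficient from the same samples, as done here for brevity, introduces a small bias.

```python
import numpy as np

def simulator(x):
    # Hypothetical stand-in for an expensive solver output.
    return np.exp(x)

def surrogate(x):
    # Hypothetical cheap surrogate: second-order Taylor expansion of exp at 0.
    return 1.0 + x + 0.5 * x**2

# Exact mean of the surrogate for X ~ U(0, 1): 1 + 1/2 + 1/6.
SURROGATE_MEAN = 1.0 + 0.5 + 1.0 / 6.0

def mc_and_cv(n, rng):
    """Return the plain MC estimate and the CV-corrected estimate of E[simulator(X)]."""
    x = rng.uniform(0.0, 1.0, size=n)
    fx = simulator(x)
    gx = surrogate(x)
    # CV coefficient: sample covariance of (f, g) divided by sample variance of g.
    cov = np.cov(fx, gx)
    alpha = cov[0, 1] / cov[1, 1]
    mc_estimate = fx.mean()
    # Correct the MC estimate with the surrogate's deviation from its known mean.
    cv_estimate = mc_estimate - alpha * (gx.mean() - SURROGATE_MEAN)
    return mc_estimate, cv_estimate

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    runs = np.array([mc_and_cv(100, rng) for _ in range(200)])
    print("empirical variance of MC estimator:", runs[:, 0].var())
    print("empirical variance of CV estimator:", runs[:, 1].var())
```

Because the surrogate is strongly correlated with the simulator output in this toy setting, the CV estimator should show a much smaller empirical variance than plain MC across repeated runs. The multilevel strategies described in the abstract apply the same kind of correction to the level differences of an MLMC estimator.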
