Multilevel Surrogate-based Control Variates

Monte Carlo (MC) sampling is a popular method for estimating the statistics (e.g., expectation and variance) of a random variable. Its slow convergence has motivated advanced techniques that reduce the variance of the MC estimator for the outputs of computationally expensive solvers. The control variates (CV) method corrects the MC estimator with a term derived from auxiliary random variables that are highly correlated with the original random variable. These auxiliary variables may come from surrogate models. Such a surrogate-based CV strategy is extended here to the multilevel Monte Carlo (MLMC) framework, which relies on a sequence of levels corresponding to numerical simulators of increasing accuracy and computational cost. MLMC combines output samples obtained across levels into a telescoping sum of differences between MC estimators for successive fidelities. In this paper, we introduce three multilevel variance reduction strategies that rely on surrogate-based CV and MLMC. MLCV extends CV by summing the correction terms derived from surrogate models of the simulators at different levels. MLMC-CV improves the MLMC estimator by using, at each level, a CV based on a surrogate of that level's correction term. MLMC-MLCV achieves further variance reduction by using the surrogate-based CVs of all the levels at each level. Alternative solutions that reduce the subset of surrogates used for the multilevel estimation are also introduced. The proposed methods are evaluated on a test case from the literature: a spectral discretization of an uncertain 1D heat equation, where the statistic of interest is the expected value of the temperature integrated along the domain at a given time. The results are assessed in terms of the accuracy and computational cost of the multilevel estimators, distinguishing the case where the surrogates are built beforehand from the case where their construction cost is part of the estimation.
The results show that when the lower-fidelity outputs are strongly correlated with the high-fidelity outputs, a significant variance reduction is obtained by using surrogate models for the coarser levels only. They also show that taking advantage of pre-existing surrogate models is an even more efficient strategy.
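
As a minimal illustration of the two building blocks combined in this work, the sketch below implements a single-level surrogate-based CV estimator and a plain MLMC telescoping estimator with NumPy. It is not the MLCV, MLMC-CV, or MLMC-MLCV estimators themselves. Everything problem-specific is an assumption made for the example: the input X is uniform on (0, 1), the "simulator" is exp(X), the surrogate is its second-order Taylor polynomial (whose mean is known exactly), and the function names are hypothetical.

```python
import numpy as np


def cv_estimate(f_samples, g_samples, g_mean):
    """Control-variate estimate of E[f], given a control g whose mean is known."""
    c = np.cov(f_samples, g_samples)   # 2x2 sample covariance matrix
    alpha = c[0, 1] / c[1, 1]          # estimated optimal coefficient Cov(f, g) / Var(g)
    return f_samples.mean() - alpha * (g_samples.mean() - g_mean)


def mlmc_estimate(levels, n_samples, rng):
    """Plain MLMC: MC on the coarsest level plus MC on successive level differences."""
    total = 0.0
    for l, n in enumerate(n_samples):
        x = rng.uniform(size=n)        # independent inputs for each level
        y = levels[l](x)
        if l > 0:
            y = y - levels[l - 1](x)   # both terms of a difference share the same inputs
        total += y.mean()
    return total


# Toy stand-ins (assumptions, not the heat-equation test case of the paper).
def simulator(x):
    return np.exp(x)                   # E[exp(X)] = e - 1 for X ~ U(0, 1)


def surrogate(x):
    return 1.0 + x + 0.5 * x**2        # second-order Taylor polynomial of exp


SURROGATE_MEAN = 1.0 + 1.0 / 2.0 + 1.0 / 6.0   # exact E[surrogate(X)]


def compare_mc_and_cv(n=100, repeats=200, seed=0):
    """Repeat both estimators on the same samples to compare their spread."""
    rng = np.random.default_rng(seed)
    mc, cv = [], []
    for _ in range(repeats):
        x = rng.uniform(size=n)
        fx = simulator(x)
        mc.append(fx.mean())
        cv.append(cv_estimate(fx, surrogate(x), SURROGATE_MEAN))
    return np.array(mc), np.array(cv)


mc, cv = compare_mc_and_cv()
print("MC estimator variance:", mc.var())
print("CV estimator variance:", cv.var())
```

Because the toy surrogate is almost perfectly correlated with the toy simulator, the empirical variance of the CV estimates should be far smaller than that of the plain MC estimates. The coefficient alpha is estimated from the same samples as the means, which introduces a small bias that vanishes as the sample size grows. Calling `mlmc_estimate` with a list of increasingly accurate functions, for instance `[lambda x: 1.0 + x, surrogate, simulator]`, and decreasing sample sizes per level mimics the telescoping sum described above.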
