An approximate control variates approach to multifidelity distribution estimation

Forward simulation-based uncertainty quantification, which studies the output distribution of quantities of interest (QoI), is a crucial component of computationally robust statistics and engineering. A large body of literature is devoted to accurately assessing statistics of QoI; in particular, multilevel and multifidelity approaches are known to be effective because they leverage cost-accuracy tradeoffs across a given ensemble of models. However, effective algorithms that can estimate the full distribution of outputs are still under active development. In this paper, we introduce a general multifidelity framework for estimating the cumulative distribution functions (CDFs) of vector-valued QoI associated with a high-fidelity model under a budget constraint. Given a family of appropriate control variates obtained from lower-fidelity surrogates, our framework identifies the most cost-effective model subset and then uses it to build an approximate control variates estimator for the target CDF. We instantiate the framework by constructing a family of control variates using intermediate linear approximators and rigorously analyze the corresponding algorithm. Our analysis reveals that the resulting CDF estimator is uniformly consistent and budget-asymptotically optimal under only mild moment and regularity assumptions. The approach provides a robust multifidelity CDF estimator that is adaptive to the available budget, does not require a priori knowledge of cross-model statistics or model hierarchy, and is applicable to general output dimensions. We demonstrate the efficiency and robustness of the approach on several test examples.
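To make the idea of an approximate control variates estimator for a CDF concrete, the following is a minimal sketch for a scalar QoI at a single threshold. It is not the algorithm of the paper: it assumes just one low-fidelity surrogate, uses the surrogate's indicator function directly as the control variate (rather than the intermediate linear approximators described above), and performs no model selection or budget allocation. The function name `acv_cdf` and the toy models `high` and `low` are hypothetical and chosen only for illustration.

```python
import math
import random


def acv_cdf(hi, lo_paired, lo_extra, t):
    """Approximate control variates estimate of P(hi <= t) at one threshold t.

    hi        : high-fidelity outputs on N shared input samples
    lo_paired : low-fidelity outputs on the same N input samples
    lo_extra  : low-fidelity outputs on additional (cheap) input samples
    """
    n = len(hi)
    ind_hi = [1.0 if y <= t else 0.0 for y in hi]
    ind_lo = [1.0 if y <= t else 0.0 for y in lo_paired]
    ind_lo_all = ind_lo + [1.0 if y <= t else 0.0 for y in lo_extra]

    mean_hi = sum(ind_hi) / n
    mean_lo = sum(ind_lo) / n
    # The exact low-fidelity CDF is unknown, so it is approximated with the
    # larger low-fidelity sample; this is what makes the estimator "approximate".
    mean_lo_all = sum(ind_lo_all) / len(ind_lo_all)

    # Control variate weight estimated from the paired samples.
    cov = sum((a - mean_hi) * (b - mean_lo) for a, b in zip(ind_hi, ind_lo)) / n
    var = sum((b - mean_lo) ** 2 for b in ind_lo) / n
    beta = cov / var if var > 0 else 0.0

    est = mean_hi - beta * (mean_lo - mean_lo_all)
    return min(1.0, max(0.0, est))  # keep the estimate a valid probability


# Hypothetical toy models (assumptions for this sketch only).
def high(x):
    return x + 0.2 * math.sin(3.0 * x)  # "expensive" model


def low(x):
    return x  # cheap surrogate


rng = random.Random(0)
x_paired = [rng.gauss(0.0, 1.0) for _ in range(200)]
x_extra = [rng.gauss(0.0, 1.0) for _ in range(20000)]

estimate = acv_cdf(
    [high(x) for x in x_paired],
    [low(x) for x in x_paired],
    [low(x) for x in x_extra],
    t=0.0,
)
print(estimate)
```

In this toy setup `high` is monotone with `high(0) = 0`, so the true CDF value at `t = 0` is 0.5, and the estimate can be compared against it. Evaluating `acv_cdf` over a grid of thresholds would give a pointwise CDF estimate; note that a pointwise construction like this one does not by itself guarantee monotonicity in `t`.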
