Numerical models are widely used to simulate the Earth system, but they are computationally expensive and often depend on many uncertain input parameters. Using them effectively requires calibration and uncertainty quantification, which typically involve running the model across many input configurations and therefore incur substantial computational cost. Statistical emulation provides a practical alternative for efficiently exploring model behavior. We are motivated by MPAS-Seaice, the sea ice component of the Energy Exascale Earth System Model (E3SM), applied here to Arctic sea ice. MPAS-Seaice generates large spatiotemporal outputs at multiple spatial resolutions: high-resolution (high-fidelity, HF) simulations are more accurate but computationally more expensive than low-resolution (low-fidelity, LF) simulations. Multi-fidelity (MF) emulation integrates information across resolutions to construct efficient and accurate surrogate models, yet existing approaches struggle to scale to large spatiotemporal data. We develop an MF emulator that combines tensor decomposition for dimensionality reduction, Gaussian process priors for flexible function approximation, and an additive discrepancy model to capture systematic differences between LF and HF data. The proposed framework enables scalable emulation while maintaining accurate predictions and well-calibrated uncertainty for complex spatiotemporal fields. In both simulation studies and the MPAS-Seaice analysis, it consistently achieves lower prediction error and reduced uncertainty than LF-only and HF-only models. By leveraging the complementary strengths of LF and HF data and using an efficient tensor decomposition approach, our emulator greatly reduces computational expense, making it well suited for large-scale simulation tasks involving complex physical models.
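
The three ingredients named above (tensor decomposition, Gaussian process priors, and an additive discrepancy) can be written schematically as follows. This is only an illustrative sketch: the notation is assumed rather than taken from the paper, namely inputs $\mathbf{x}$, spatial location $s$, time $t$, ranks $R$ and $Q$, factors $a_r, b_r, c_q, d_q$, and kernels $k_r, \kappa_q$. It further assumes a CP-style low-rank form and that LF and HF outputs have been mapped to a common grid; the actual model may differ in each of these respects.

```latex
% Additive discrepancy: HF output = LF output + systematic difference
y_H(\mathbf{x}, s, t) = y_L(\mathbf{x}, s, t) + \delta(\mathbf{x}, s, t)

% Low-rank (CP-style) representation of the LF field, with
% input-dependent weights given Gaussian process priors
y_L(\mathbf{x}, s, t) \approx \sum_{r=1}^{R} w_r(\mathbf{x})\, a_r(s)\, b_r(t),
\qquad w_r \sim \mathcal{GP}\bigl(0, k_r(\mathbf{x}, \mathbf{x}')\bigr)

% The discrepancy is treated the same way, with its own factors and priors
\delta(\mathbf{x}, s, t) \approx \sum_{q=1}^{Q} v_q(\mathbf{x})\, c_q(s)\, d_q(t),
\qquad v_q \sim \mathcal{GP}\bigl(0, \kappa_q(\mathbf{x}, \mathbf{x}')\bigr)
```

Under these assumptions, the Gaussian processes only need to be fit to the $R + Q$ low-dimensional weight functions rather than to every space-time location, which is where the computational savings of such a construction would come from.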