Autocalibration of the E3SM version 2 atmosphere model using a PCA-based surrogate for spatial fields

Tuning (calibrating) a global climate model (GCM) is tedious and time-consuming, and both its input and output fields are high-dimensional. Experts typically tune by iteratively running climate simulations with hand-picked values of the tuning parameters. Many alternative calibration methods have been proposed in both the statistical and climate literature, but most are impractical or difficult to implement. We present a practical, robust, and rigorous calibration approach and apply it to the atmosphere-only model of the Department of Energy's Energy Exascale Earth System Model (E3SM) version 2. The approach has two main parts: (1) training a surrogate that predicts E3SM output in a fraction of the time needed to run E3SM, and (2) gradient-based parameter optimization. To train the surrogate, we generate a designed ensemble of runs that spans the input parameter space and fit the E3SM output with polynomial chaos expansions on a reduced output space. We then use the surrogate in an optimization scheme to identify the input parameter values for which the model best matches gridded spatial fields of climate observations. To validate the chosen parameters, we run E3SMv2 with the optimal values and compare the results with expert-tuned simulations across 45 output fields. This flexible, robust, and automated approach is straightforward to implement. We demonstrate that the resulting model output matches present-day climate observations as well as or better than the corresponding output from expert-tuned parameter values, while accounting for high-dimensional output and requiring a fraction of the time.
