Efficient Learning of Accurate Surrogates for Simulations of Complex Systems

Machine learning methods are increasingly used to build computationally inexpensive surrogates for complex physical models. The predictive capability of these surrogates suffers when data are noisy, sparse, or time-dependent. As we are interested in finding a surrogate that provides valid predictions of any potential future model evaluations, we introduce an online learning method empowered by optimizer-driven sampling. The method has two advantages over current approaches. First, it ensures that all turning points on the model response surface are included in the training data. Second, after any new model evaluations, surrogates are tested and "retrained" (updated) if the "score" drops below a validity threshold. Tests on benchmark functions reveal that optimizer-directed sampling generally outperforms traditional sampling methods in terms of accuracy around local extrema, even when the scoring metric favors overall accuracy. We apply our method to simulations of nuclear matter to demonstrate that highly accurate surrogates for the nuclear equation of state can be reliably auto-generated from expensive calculations using a few model evaluations.
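The loop described above (sample with an optimizer, score the surrogate on the new evaluations, retrain when it is no longer valid) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `model` is a hypothetical one-dimensional toy function standing in for an expensive simulation, `local_search` is a simple hill-climb used in place of a real optimizer, `LinearSurrogate` is a piecewise-linear interpolant used in place of a learned estimator, and the validity test is expressed as a maximum absolute error against a tolerance `tol` rather than a score against a threshold. The stopping rule (three consecutive valid rounds) is likewise an assumption.

```python
import bisect
import math
import random


def model(x):
    """Hypothetical stand-in for an expensive simulation."""
    return math.sin(3.0 * x) + 0.5 * x


class LinearSurrogate:
    """Piecewise-linear interpolant over all evaluations stored so far."""

    def __init__(self):
        self.data = {}  # maps x -> model(x)
        self.xs = []

    def fit(self, points):
        # "Retraining" here just means merging new evaluations and re-sorting.
        self.data.update(points)
        self.xs = sorted(self.data)

    def predict(self, x):
        xs = self.xs
        if x <= xs[0]:
            return self.data[xs[0]]
        if x >= xs[-1]:
            return self.data[xs[-1]]
        hi = bisect.bisect_right(xs, x)  # xs[hi - 1] <= x < xs[hi]
        x0, x1 = xs[hi - 1], xs[hi]
        t = (x - x0) / (x1 - x0)
        return (1.0 - t) * self.data[x0] + t * self.data[x1]


def local_search(f, x0, lo, hi, sign, step=0.25, tol=1e-3):
    """Hill-climb toward a local minimum (sign=+1) or maximum (sign=-1).

    Returns every (x, f(x)) pair evaluated along the way, so the points
    near the extremum end up in the training data.
    """
    trace = {x0: f(x0)}
    x = x0
    while step > tol:
        moved = False
        for cand in (x - step, x + step):
            cand = min(hi, max(lo, cand))  # stay inside the bounds
            if cand not in trace:
                trace[cand] = f(cand)
            if sign * trace[cand] < sign * trace[x]:
                x, moved = cand, True
        if not moved:
            step *= 0.5  # refine once neither neighbor improves
    return trace


def build_surrogate(f, lo, hi, tol=0.05, max_rounds=50, patience=3, seed=0):
    """Online loop: sample with the optimizer, test, retrain if invalid."""
    rng = random.Random(seed)
    surrogate = LinearSurrogate()
    surrogate.fit({lo: f(lo), hi: f(hi)})
    valid_streak = 0
    for _ in range(max_rounds):
        start = rng.uniform(lo, hi)
        sign = rng.choice((1, -1))  # search for a minimum or a maximum
        new = local_search(f, start, lo, hi, sign)
        # Test the current surrogate on the new model evaluations.
        err = max(abs(surrogate.predict(x) - y) for x, y in new.items())
        if err > tol:
            surrogate.fit(new)  # invalid: update with the new data
            valid_streak = 0
        else:
            valid_streak += 1
            if valid_streak >= patience:
                break
    return surrogate


surrogate = build_surrogate(model, 0.0, 4.0)
print(len(surrogate.data), "model evaluations stored")
```

Because the optimizer's trajectory concentrates evaluations near local extrema, the stored data tends to be densest where the response surface turns. This sketch does not by itself guarantee that every turning point is found; that depends on how many searches are launched and from where.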

