This work develops an active learning framework to intelligently enrich data-driven reduced-order models (ROMs) of parametric dynamical systems, which can serve as the foundation of virtual assets in a digital twin. Data-driven ROMs are explainable, computationally efficient scientific machine learning models that aim to preserve the underlying physics of complex dynamical simulations. Because the quality of a data-driven ROM is sensitive to the quality of its limited training data, we seek training parameters whose associated training data yield the best possible parametric ROM. Our approach uses the operator inference methodology, a regression-based strategy that can be tailored to particular parametric structure for a large class of problems. We establish a probabilistic version of parametric operator inference, casting the learning problem as a Bayesian linear regression. Prediction uncertainties stemming from the resulting probabilistic ROM solutions are used to design a sequential adaptive sampling scheme that selects new training parameter vectors to promote ROM stability and accuracy globally in the parameter domain. We conduct numerical experiments for several nonlinear parametric systems of partial differential equations and compare the results to ROMs trained on random parameter samples. The results demonstrate that the proposed adaptive sampling strategy consistently yields more stable and accurate ROMs than random sampling does under the same computational budget.
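To make the idea of uncertainty-driven sampling concrete, the following is a minimal toy sketch and not the method of this work. It assumes a scalar parameter, a hypothetical quadratic feature map `features`, a zero-mean Gaussian prior with precision `alpha`, and a known noise precision `beta`. It uses the largest predictive variance as a stand-in selection criterion, whereas the abstract describes a criterion based on the uncertainty of probabilistic ROM solutions. All function names and numerical values are illustrative assumptions.

```python
import numpy as np

def features(mu):
    """Hypothetical feature map [1, mu, mu^2] for a scalar parameter (assumption)."""
    mu = np.atleast_1d(mu).astype(float)
    return np.column_stack([np.ones_like(mu), mu, mu**2])

def posterior(X, y, alpha=1e-2, beta=1e2):
    """Gaussian posterior over weights for prior N(0, I/alpha) and noise precision beta."""
    cov = np.linalg.inv(alpha * np.eye(X.shape[1]) + beta * X.T @ X)
    mean = beta * cov @ X.T @ y
    return mean, cov

def predictive_variance(Xc, cov, beta=1e2):
    """Predictive variance 1/beta + x^T cov x for each candidate row x of Xc."""
    return 1.0 / beta + np.einsum("ij,jk,ik->i", Xc, cov, Xc)

def select_next(train_mu, train_y, candidates):
    """Pick the candidate parameter with the largest predictive variance."""
    _, cov = posterior(features(train_mu), train_y)
    var = predictive_variance(features(candidates), cov)
    return candidates[np.argmax(var)], var

# Toy data: three training parameters clustered near zero (illustrative only).
train_mu = np.array([0.0, 0.1, 0.2])
train_y = np.sin(train_mu)
candidates = np.linspace(0.0, 1.0, 11)

next_mu, var = select_next(train_mu, train_y, candidates)
print("next training parameter:", next_mu)
```

In a sequential scheme, the selected parameter would be used to generate new training data, the regression would be refit, and the selection repeated until the computational budget is exhausted.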