Purpose: Simulation-based digital twins represent an effort to provide high-accuracy real-time insights into operational physical processes. However, the computation time of many multi-physical simulation models is far from real-time. It might even exceed sensible time frames to produce sufficient data for training data-driven reduced-order models. This study presents TwinLab, a framework for data-efficient, yet accurate training of neural-ODE type reduced-order models with only two data sets.

Design/methodology/approach: Correlations between test errors of reduced-order models and distinct features of corresponding training data are investigated. Having found the single best data sets for training, a second data set is sought with the help of similarity and error measures to enrich the training process effectively.

Findings: Adding a suitable second training data set in the training process reduces the test error by up to 49% compared to the best base reduced-order model trained only with one data set. Such a second training data set should at least yield a good reduced-order model on its own and exhibit higher levels of dissimilarity to the base training data set regarding the respective excitation signal. Moreover, the base reduced-order model should have elevated test errors on the second data set. The relative error of the time series ranges from 0.18% to 0.49%. Prediction speed-ups of up to a factor of 36,000 are observed.

Originality: The proposed computational framework facilitates the automated, data-efficient extraction of non-intrusive reduced-order models for digital twins from existing simulation models, independent of the simulation software.
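
The three criteria for a second training data set named in the findings (it yields a good reduced-order model on its own, its excitation signal is dissimilar to that of the base data set, and the base reduced-order model has elevated test errors on it) can be illustrated with a minimal Python sketch. This is not the TwinLab implementation. The `Candidate` record, the normalised RMS difference used as dissimilarity measure, the `max_own_error` threshold, the product score, and all numbers are assumptions chosen for illustration only; the abstract does not specify which similarity and error measures are used.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class Candidate:
    """Hypothetical record describing one candidate second data set."""
    name: str
    excitation: Sequence[float]  # sampled excitation signal of the data set
    own_rom_error: float         # test error of a ROM trained on this set alone
    base_rom_error: float        # error of the base ROM evaluated on this set


def dissimilarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalised RMS difference of two equally sampled signals (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("signals must have the same number of samples")
    rms_diff = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    rms_a = sqrt(sum(x * x for x in a) / len(a))
    return rms_diff / (rms_a or 1.0)  # avoid division by zero for a null signal


def rank_second_data_sets(
    base_excitation: Sequence[float],
    candidates: Sequence[Candidate],
    max_own_error: float,
) -> List[str]:
    """Return candidate names, most promising first."""
    # Criterion 1: the candidate must yield a good ROM on its own.
    eligible = [c for c in candidates if c.own_rom_error <= max_own_error]
    # Criteria 2 and 3: prefer a dissimilar excitation and a high base-ROM error.
    # Multiplying the two is a hypothetical way of combining them.
    scored = [
        (dissimilarity(base_excitation, c.excitation) * c.base_rom_error, c.name)
        for c in eligible
    ]
    return [name for _, name in sorted(scored, key=lambda t: t[0], reverse=True)]


# Made-up example data, not taken from the study.
base_signal = [0.0, 1.0, 0.0, -1.0]
candidates = [
    Candidate("A", [0.0, 1.0, 0.0, -1.0], own_rom_error=0.01, base_rom_error=0.02),
    Candidate("B", [1.0, 0.0, -1.0, 0.0], own_rom_error=0.01, base_rom_error=0.05),
    Candidate("C", [0.0, 2.0, 0.0, -2.0], own_rom_error=0.50, base_rom_error=0.30),
]
ranking = rank_second_data_sets(base_signal, candidates, max_own_error=0.1)
print(ranking)
```

In this made-up example, candidate C is excluded because it does not yield a good model on its own, and B ranks ahead of A because its excitation differs from the base signal and the base model performs worse on it.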