TwinLab: a framework for data-efficient training of non-intrusive reduced-order models for digital twins

Purpose: Simulation-based digital twins aim to provide high-accuracy, real-time insights into operational physical processes. However, the computation time of many multi-physical simulation models is far from real-time. It may even be too long to produce, within a sensible time frame, enough data for training data-driven reduced-order models. This study presents TwinLab, a framework for data-efficient yet accurate training of neural-ODE type reduced-order models with only two data sets.

Design/methodology/approach: Correlations between the test errors of reduced-order models and distinct features of the corresponding training data are investigated. Once the best single data sets for training have been identified, a second data set is sought with the help of similarity and error measures to enrich the training process effectively.

Findings: Adding a suitable second data set to the training process reduces the test error by up to 49% compared to the best base reduced-order model trained with only one data set. Such a second training data set should at least yield a good reduced-order model on its own and should be more dissimilar to the base training data set with respect to the excitation signal. Moreover, the base reduced-order model should have elevated test errors on the second data set. The relative error of the time series ranges from 0.18% to 0.49%. Prediction speed-ups of up to a factor of 36,000 are observed.

Originality: The proposed computational framework facilitates the automated, data-efficient extraction of non-intrusive reduced-order models for digital twins from existing simulation models, independent of the simulation software.
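The three criteria for a second training data set stated in the findings can be illustrated with a minimal sketch. It is not the TwinLab implementation: the abstract does not name the similarity or error measures, so the relative L2 error, the correlation-based dissimilarity, the product score, and all names (`relative_error`, `dissimilarity`, `rank_second_datasets`, `own_error_limit`, the dictionary keys) are assumptions chosen for illustration only.

```python
import numpy as np


def relative_error(y_true, y_pred):
    """Relative L2 error of a predicted time series (assumed metric)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.linalg.norm(y_pred - y_true) / np.linalg.norm(y_true))


def dissimilarity(u_a, u_b):
    """Assumed measure: 1 - |Pearson correlation| of two excitation signals.

    Both signals must be sampled on the same time grid.
    """
    r = np.corrcoef(np.asarray(u_a, dtype=float),
                    np.asarray(u_b, dtype=float))[0, 1]
    return float(1.0 - abs(r))


def rank_second_datasets(base_excitation, candidates, own_error_limit):
    """Rank candidate second data sets, best first.

    Each candidate is a dict with the hypothetical keys:
      name             -- label of the data set
      excitation       -- its excitation signal
      own_test_error   -- test error of a model trained on it alone
      base_error_on_it -- error of the base model evaluated on it
    """
    # Criterion 1: the candidate must yield a good model on its own.
    eligible = [c for c in candidates
                if c["own_test_error"] <= own_error_limit]
    scored = []
    for c in eligible:
        # Criteria 2 and 3: prefer excitations unlike the base data set
        # and data on which the base model performs poorly.
        score = (dissimilarity(base_excitation, c["excitation"])
                 * c["base_error_on_it"])
        scored.append((score, c["name"]))
    return [name for _, name in sorted(scored, reverse=True)]
```

In this sketch the first criterion acts as a hard filter and the other two are combined into a single score; a weighted sum or a Pareto ranking would be equally plausible under the description in the abstract.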

