The unknown parameters of simulation models often need to be calibrated using observed data. When simulation models are expensive, calibration is usually carried out with an emulator. The effectiveness of the calibration process can be significantly improved by selecting parameters sequentially to build an emulator. The expansion of parallel computing environments, from multicore personal computers to many-node servers to large-scale cloud computing environments, can yield further calibration efficiency gains by allowing the simulation model to be evaluated at a batch of parameters in parallel within a sequential design. However, understanding the performance implications of different sequential approaches in parallel computing environments introduces new complexities, since the speed-up is affected by many factors, such as the run time of a simulation model and the variability in that run time. This work proposes a new performance model to understand and benchmark the performance of different sequential procedures for the calibration of simulation models in parallel environments. We provide metrics and a suite of techniques for visualizing the numerical experiment results, and demonstrate these with a novel sequential procedure. The proposed performance model, as well as the new sequential procedure and other state-of-the-art techniques, are implemented in the open-source Python software package Parallel Uncertainty Quantification (PUQ), which allows users to run a simulation model in parallel.
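The batch evaluation idea underlying the abstract can be illustrated with a minimal sketch. This is not the PUQ interface: `simulation` and `evaluate_batch` are hypothetical stand-ins showing how a batch of candidate parameters selected by a sequential design might be dispatched to workers in parallel. A real study would replace the toy simulator with an expensive model and the thread pool with processes or cluster nodes.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor

def simulation(theta):
    """Hypothetical stand-in for an expensive simulator: parameter -> output."""
    time.sleep(0.01)  # mimic a (possibly variable) simulation run time
    return math.sin(theta) + 0.5 * theta

def evaluate_batch(thetas, max_workers=4):
    """Evaluate the simulator at a batch of parameters in parallel.

    In a batch-sequential design, each round selects a batch of parameters,
    evaluates them concurrently, and uses the results to refit the emulator
    before choosing the next batch.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(simulation, thetas))

batch = [0.0, 0.5, 1.0, 1.5]   # one batch proposed by a sequential criterion
outputs = evaluate_batch(batch)
```

With idealized workers, the wall-clock cost of a round is governed by the slowest run in the batch rather than the sum of all runs, which is why run-time variability matters for the achievable speed-up.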