High-performance scientific simulations, essential for understanding complex systems, face computational challenges, especially when exploring extensive parameter spaces. There has been increasing interest in developing deep neural networks (DNNs) as surrogate models capable of accelerating these simulations. However, existing approaches for training DNN surrogates rely on extensive simulation data that are heuristically selected and computationally expensive to generate -- a challenge under-explored in the literature. In this paper, we investigate the potential of incorporating active learning into DNN surrogate training. Active learning allows intelligent and objective selection of training simulations, reducing both the need to generate extensive simulation data and the dependence of DNN surrogate performance on pre-defined training simulations. In the problem context of constructing DNN surrogates for diffusion equations with sources, we examine the efficacy of diversity- and uncertainty-based strategies for selecting training simulations, considering two different DNN architectures. The results lay the groundwork for developing high-performance computing infrastructure for Smart Surrogates that supports on-the-fly generation of simulation data steered by active learning strategies, with the potential to improve the efficiency of scientific simulations.
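As a rough illustration of the two families of selection strategies named above, the following minimal sketch scores a pool of candidate simulation parameters. It is not the method used in this work: the use of ensemble variance as an uncertainty proxy, the greedy farthest-point rule for diversity, and the function names `select_by_uncertainty` and `select_by_diversity` are all assumptions made for illustration.

```python
import numpy as np


def select_by_uncertainty(ensemble_preds, k):
    """Pick the k candidates on which an ensemble of surrogates disagrees most.

    ensemble_preds has shape (n_models, n_candidates, n_outputs).
    Assumption: ensemble variance stands in for predictive uncertainty.
    """
    # Variance across ensemble members, averaged over the output dimensions.
    scores = ensemble_preds.var(axis=0).mean(axis=-1)
    # Indices of the k highest-scoring candidates, most uncertain first.
    return np.argsort(scores)[::-1][:k]


def select_by_diversity(candidates, labeled, k):
    """Greedy farthest-point selection in parameter space (assumed strategy).

    candidates: (n_candidates, n_params); labeled: (n_labeled, n_params).
    """
    # Distance from each candidate to its nearest already-simulated point.
    diffs = candidates[:, None, :] - labeled[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1).min(axis=1)
    chosen = []
    for _ in range(k):
        idx = int(np.argmax(dists))
        chosen.append(idx)
        # The newly chosen point now counts as covered for later picks.
        new_dists = np.linalg.norm(candidates - candidates[idx], axis=-1)
        dists = np.minimum(dists, new_dists)
    return np.array(chosen)
```

In an active learning loop, the returned indices would identify which simulations to run next before retraining the surrogate; how the two criteria are combined or scheduled is left open here.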