In the field of phase change phenomena, the lack of accessible and diverse datasets suitable for machine learning (ML) training poses a significant challenge. Existing experimental datasets are often restricted, with limited availability and sparse ground truth data, impeding our understanding of this complex multiphysics phenomenon. To bridge this gap, we present the BubbleML dataset\footnote{\label{git_dataset}\url{https://github.com/HPCForge/BubbleML}}, which leverages physics-driven simulations to provide accurate ground truth information for various boiling scenarios, encompassing nucleate pool boiling, flow boiling, and sub-cooled boiling. The dataset comprises 79 simulations covering a wide range of parameters, including varying gravity conditions, flow rates, sub-cooling levels, and wall superheat. BubbleML is validated against experimental observations and trends, establishing it as an invaluable resource for ML research. Furthermore, we showcase its potential to facilitate exploration of diverse downstream tasks by introducing two benchmarks: (a) optical flow analysis to capture bubble dynamics, and (b) operator networks for learning temperature dynamics. The BubbleML dataset and its benchmarks serve as a catalyst for advancements in ML-driven research on multiphysics phase change phenomena, enabling the development and comparison of state-of-the-art techniques and models.