In the field of phase change phenomena, the lack of accessible and diverse datasets suitable for machine learning (ML) training poses a significant challenge. Existing experimental datasets are often restricted, with limited availability and sparse ground truth data, which impedes our understanding of these complex multi-physics phenomena. To bridge this gap, we present the BubbleML dataset (https://github.com/HPCForge/BubbleML), which leverages physics-driven simulations to provide accurate ground truth information for various boiling scenarios, encompassing nucleate pool boiling, flow boiling, and sub-cooled boiling. The dataset comprises 51 simulations covering a wide range of parameters, including varying gravity conditions, flow rates, sub-cooling levels, and wall superheat. BubbleML is validated against experimental observations and trends, establishing it as a valuable resource for ML research. Furthermore, we showcase its potential to facilitate exploration of diverse downstream tasks by introducing two benchmarks: (a) optical flow analysis to capture bubble dynamics, and (b) operator networks for learning temperature dynamics. The BubbleML dataset and its benchmarks serve as a catalyst for advancements in ML-driven research on multi-physics phase change phenomena, enabling the development and comparison of state-of-the-art techniques and models.