Hybrid Cold-Start Recommender System for Closure Model Selection in Multiphase Flow Simulations

Selecting appropriate physical models is a critical yet difficult step in many areas of computational science and engineering. In multiphase Computational Fluid Dynamics (CFD), practitioners must choose among numerous closure model combinations whose performance varies strongly across flow conditions. Sub-optimal choices can lead to inaccurate predictions, simulation failures, and wasted computational resources, making model selection a prime candidate for data-driven decision support. This work formulates closure model selection as a cold-start recommender system problem in a high-cost scientific domain. We propose a hybrid recommendation framework that combines (i) metadata-driven case similarity and (ii) collaborative inference via matrix completion. The approach enables case-specific model recommendations for entirely new CFD cases using their descriptive features, while leveraging historical simulation results from similar cases. The methodology is evaluated on 13,600 simulations across 136 validation cases and 100 model combinations. A nested cross-validation protocol with experiment-level holdout is employed to rigorously assess generalisation to unseen flow scenarios under varying levels of data sparsity. Recommendation quality is measured using ranking-based metrics and a domain-specific regret measure capturing performance loss relative to the per-case optimum. Results show that the proposed hybrid recommender consistently outperforms popularity-based and expert-designed reference models and reduces regret across the investigated sparsities. These findings demonstrate that recommender system methodology can effectively support complex scientific decision-making tasks characterised by expensive evaluations, structured metadata, and limited prior observations.
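The two components named above, metadata-driven case similarity and collaborative inference via matrix completion, together with the regret measure, can be illustrated with a small sketch. The abstract does not specify the similarity measure, the completion algorithm, or how the two are combined, so everything below is an assumption made for illustration: cosine similarity on case features, an iterated truncated-SVD imputation, a similarity-weighted average over the k nearest historical cases, and an error metric where lower is better. The function names (`complete_low_rank`, `recommend`, `regret`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def complete_low_rank(R, rank=2, n_iter=50):
    """Fill NaN entries of R by iterating a truncated-SVD reconstruction.

    Assumes every column has at least one observed entry.
    """
    mask = ~np.isnan(R)
    col_means = np.nanmean(R, axis=0)
    filled = np.where(mask, R, col_means)      # initial guess: column means
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(mask, R, approx)     # observed entries stay fixed
    return filled

def recommend(x_new, X_hist, R_completed, k=3):
    """Rank models for an unseen case using its k most similar historical cases."""
    Xn = X_hist / np.linalg.norm(X_hist, axis=1, keepdims=True)
    xn = x_new / np.linalg.norm(x_new)
    sim = Xn @ xn                              # cosine similarity to each case
    nearest = np.argsort(sim)[::-1][:k]
    weights = np.clip(sim[nearest], 1e-12, None)
    predicted_error = weights @ R_completed[nearest] / weights.sum()
    return np.argsort(predicted_error)         # best (lowest error) first

def regret(true_errors, recommended_model):
    """Error of the recommended model minus the best achievable error for the case."""
    return true_errors[recommended_model] - true_errors.min()

# Synthetic illustration only: 20 historical cases, 4 metadata features, 6 models.
rng = np.random.default_rng(0)
X_hist = rng.random((20, 4))
R_true = rng.random((20, 2)) @ rng.random((2, 6))   # low-rank error matrix
R_obs = R_true.copy()
R_obs[rng.random(R_true.shape) < 0.3] = np.nan      # hide about 30% of results
observed = ~np.isnan(R_obs)

R_completed = complete_low_rank(R_obs)
x_new = rng.random(4)                                # metadata of an unseen case
ranking = recommend(x_new, X_hist, R_completed)

true_errors_new = rng.random(6)                      # stand-in ground truth
print("ranking:", ranking)
print("regret of top pick:", regret(true_errors_new, ranking[0]))
```

In this sketch the new case contributes no simulation results of its own, which is what makes the setting cold-start: only its metadata vector `x_new` links it to the historical performance matrix. A regret of zero means the recommended model combination coincides with the per-case optimum.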
