Recent advances in machine learning (ML) have substantially improved its predictive and computational capabilities, creating promising opportunities for surrogate modeling in scientific applications. Because ML-based surrogates can accurately approximate complex functions at low computational cost, they can accelerate scientific applications by replacing computationally intensive components with faster model inference. However, integrating ML models into these applications remains a significant challenge, which hinders the widespread adoption of ML surrogates as an approximation technique in modern scientific computing. We propose an easy-to-use, directive-based programming model that lets developers describe how ML models are used in a scientific application. The runtime support, as instructed by the programming model, performs data assimilation using the original algorithm and can replace the algorithm with model inference. In an evaluation across five benchmarks covering more than 5000 ML models, we observe speedups of up to 83.6x with minimal accuracy loss (RMSE as low as 0.01).
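The two runtime behaviors described above (running the original algorithm while collecting data, and replacing it with model inference) can be illustrated with a minimal Python sketch. This is not the proposed programming model or its directive syntax: `SurrogateRegion`, `expensive_kernel`, and `fit_line` are hypothetical names introduced only for illustration, and a least-squares line stands in for a trained ML model.

```python
from typing import Callable, List, Optional, Tuple


class SurrogateRegion:
    """Hypothetical wrapper around a code region (illustration only).

    Without a model, it runs the original function and records
    (input, output) pairs. With a model, it calls the model instead.
    """

    def __init__(self, original: Callable[[float], float],
                 model: Optional[Callable[[float], float]] = None) -> None:
        self.original = original
        self.model = model
        self.samples: List[Tuple[float, float]] = []

    def __call__(self, x: float) -> float:
        if self.model is not None:
            # Inference mode: the surrogate replaces the original code.
            return self.model(x)
        # Collection mode: run the original code and record the pair.
        y = self.original(x)
        self.samples.append((x, y))
        return y


def expensive_kernel(x: float) -> float:
    """Stand-in for a computationally intensive component (assumed)."""
    return 3.0 * x + 1.0


def fit_line(samples: List[Tuple[float, float]]) -> Callable[[float], float]:
    """Fit y = a*x + b by least squares; a toy substitute for ML training."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


# Phase 1: run the original algorithm and collect training data.
region = SurrogateRegion(expensive_kernel)
for x in [0.0, 1.0, 2.0, 3.0]:
    region(x)

# Phase 2: train a surrogate on the collected data and switch to inference.
region.model = fit_line(region.samples)
prediction = region(10.0)
```

In this sketch the call site, `region(x)`, is the same in both phases; only the runtime behavior behind it changes. That separation is the property a directive-based approach aims to give developers.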