This paper presents an overview of scientific modeling and discusses the complementary strengths and weaknesses of machine learning (ML) methods relative to process-based models. It also introduces the current state of research in the emerging field of scientific knowledge-guided machine learning (KGML), which aims to use both scientific knowledge and data in ML frameworks to achieve better generalizability, scientific consistency, and explainability of results. We discuss different facets of KGML research in terms of the type of scientific knowledge used, the form of knowledge-ML integration explored, and the method for incorporating scientific knowledge in ML. We also discuss some common categories of use cases in environmental sciences where KGML methods are being developed, with illustrative examples in each category.