Technological advancements have substantially increased computational power and data availability, enabling the application of powerful machine-learning (ML) techniques across various fields. However, our ability to leverage ML methods for scientific discovery, {\it i.e.}, to obtain fundamental and formalized knowledge about natural processes, is still in its infancy. In this review, we explore how the scientific community can increasingly leverage ML techniques to achieve scientific discoveries. We observe that the applicability and opportunity of ML depend strongly on the nature of the problem domain, and on whether we have full ({\it e.g.}, turbulence), partial ({\it e.g.}, computational biochemistry), or no ({\it e.g.}, neuroscience) {\it a priori} knowledge about the governing equations and physical properties of the system. Although challenges remain, principled use of ML is opening up new avenues for fundamental scientific discoveries. Across these diverse fields, a common theme is that ML is enabling researchers to embrace complexity in observational data that was previously intractable to classical analysis and numerical investigation.