Computational modeling is a key resource for gaining insight into physical systems in modern scientific research and engineering. While access to large amounts of data has fueled the use of Machine Learning (ML) to recover physical models from experiments and to increase the accuracy of physical simulations, purely data-driven models have limited generalization and interpretability. To overcome these limitations, we propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models from experimental data. Since these models consist of mathematical expressions, they are interpretable and amenable to analysis, and the use of a natural, general-purpose discrete mathematical language for physics favors generalization with limited input data. Importantly, DEC provides building blocks for the discrete analogue of field theories, which lie beyond the current state-of-the-art applications of SR to physical problems. Further, we show that DEC makes it possible to implement a strongly-typed SR procedure that guarantees the mathematical consistency of the recovered models and reduces the search space of symbolic expressions. Finally, we demonstrate the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data: the Poisson equation, Euler's Elastica, and the equations of Linear Elasticity. Thanks to their general-purpose nature, the methods developed in this paper may be applied to diverse contexts of physical modeling.
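To illustrate how strong typing can prune the search space, the following is a minimal sketch and not the implementation used in this work. It assumes a 2-dimensional mesh and labels every quantity as a primal ("P") or dual ("D") cochain of a given degree. The names `infer_type`, `N`, and the tuple encoding of expressions are hypothetical and chosen only for this example. An expression is accepted only if every operator receives arguments of a compatible type.

```python
# Minimal sketch of type checking for DEC-style expressions.
# Assumptions (not from the paper): a 2D mesh, expressions encoded as nested
# tuples, and cochain types encoded as (complex, degree) with complex "P" or "D".

N = 2  # assumed mesh dimension


def infer_type(expr, env):
    """Return the cochain type of expr, or raise TypeError if it is ill-typed."""
    if isinstance(expr, str):
        # Leaf: a named variable whose type is declared in env.
        return env[expr]

    op, *args = expr
    arg_types = [infer_type(a, env) for a in args]

    if op == "d":
        # Coboundary: k-cochain -> (k+1)-cochain on the same complex.
        cplx, k = arg_types[0]
        if k >= N:
            raise TypeError("d cannot be applied to a top-degree cochain")
        return (cplx, k + 1)

    if op == "star":
        # Hodge star: primal k-cochain <-> dual (N-k)-cochain.
        cplx, k = arg_types[0]
        return ("D" if cplx == "P" else "P", N - k)

    if op == "add":
        # Addition is only defined between cochains of identical type.
        left, right = arg_types
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left

    raise ValueError(f"unknown operator: {op}")


# Variable declarations: u and f are primal 0-cochains (values at nodes).
env = {"u": ("P", 0), "f": ("P", 0)}

# A Poisson-like residual, d(star(d(u))) + star(f), is well-typed.
poisson_like = ("add", ("d", ("star", ("d", "u"))), ("star", "f"))

# Adding a 1-cochain to a 0-cochain is rejected before any evaluation.
ill_typed = ("add", ("d", "u"), "u")
```

In a search procedure, candidate expressions such as `ill_typed` would be discarded, or never generated, so only mathematically consistent combinations of operators are evaluated against the data.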