Symbolic regression is the study of algorithms that automate the search for analytic expressions that fit data. While recent advances in deep learning have generated renewed interest in such approaches, these efforts have not focused on physics, where the units associated with the data impose important additional constraints. Here we present $\Phi$-SO, a Physical Symbolic Optimization framework that recovers analytical symbolic expressions from physics data using deep reinforcement learning techniques that learn units constraints. Our system is built from the ground up to propose solutions whose physical units are consistent by construction. This is useful not only because it eliminates physically impossible solutions, but also because it enormously restricts the freedom of the equation generator, thus vastly improving performance. The algorithm can be used to fit noiseless data, which is useful, for instance, when attempting to derive an analytical property of a physical model, and it can also be used to obtain analytical approximations to noisy data. We showcase our machinery on a panel of examples from astrophysics.
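To make the idea of units constraints concrete, the following toy sketch shows how dimensional analysis can rule out candidate expressions and narrow the choices available to an equation generator. It is an illustration only, not the $\Phi$-SO implementation: the variable library `UNITS`, the nested-tuple expression encoding, and the helper names `units_of` and `admissible_addends` are all assumptions introduced here.

```python
# Toy illustration only; names and encoding are hypothetical, not from Phi-SO.
# Units are exponent vectors over (length, mass, time).
UNITS = {
    "m": (0, 1, 0),   # mass
    "v": (1, 0, -1),  # velocity
    "g": (1, 0, -2),  # acceleration
    "z": (1, 0, 0),   # height
}

def units_of(expr):
    """Return the unit vector of a nested-tuple expression, or None if inconsistent."""
    if isinstance(expr, str):
        return UNITS[expr]
    op, *args = expr
    arg_units = [units_of(a) for a in args]
    if any(u is None for u in arg_units):
        return None  # an inconsistent sub-expression poisons the whole tree
    if op in ("add", "sub"):
        # Sums and differences are only defined for operands with equal units.
        return arg_units[0] if arg_units[0] == arg_units[1] else None
    if op == "mul":
        return tuple(a + b for a, b in zip(*arg_units))
    if op == "div":
        return tuple(a - b for a, b in zip(*arg_units))
    if op == "square":
        return tuple(2 * a for a in arg_units[0])
    raise ValueError(f"unknown operator: {op}")

def admissible_addends(left):
    """List the library variables that may legally be added to `left`."""
    target = units_of(left)
    return sorted(name for name, u in UNITS.items() if u == target)

kinetic = ("mul", "m", ("square", "v"))       # m * v^2
potential = ("mul", "m", ("mul", "g", "z"))   # m * g * z
total = ("add", kinetic, potential)           # consistent: both are energies
bad = ("add", "m", "v")                       # inconsistent: mass + velocity
```

Here `units_of(total)` evaluates to the energy vector `(2, 1, -2)`, while `units_of(bad)` is `None`. The call `admissible_addends("z")` returns only `["z"]`, which hints at how a units rule can shrink the set of symbols a generator needs to consider at each step. This sketch checks expressions after they are built, whereas the abstract describes a system that enforces consistency by construction.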