Symbolic regression (SR) arises in many applications, such as identifying physical laws or deriving mathematical equations that describe the behavior of financial markets from given data. Various methods exist to address SR, often based on genetic programming. However, these methods are usually quite complicated and require extensive hyperparameter tuning and substantial computational resources. In this paper, we present ParFam, a new method that uses parametric families of suitable symbolic functions to translate the discrete symbolic regression problem into a continuous one, resulting in a more straightforward setup than current state-of-the-art methods. Combined with a powerful global optimizer, this approach is an effective way to tackle SR. Furthermore, it can easily be extended to more advanced algorithms, e.g., by adding a deep neural network to find well-fitting parametric families. We demonstrate the performance of ParFam with extensive numerical experiments on the common SR benchmark suite SRBench, showing that it achieves state-of-the-art results. Our code and results can be found at https://github.com/Philipp238/parfam.
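To illustrate the core idea of turning the discrete search over expressions into a continuous optimization over parameters, the following minimal sketch may help. Everything in it is an assumption made for illustration and is not taken from ParFam itself: the toy family `a + b*x + c*sin(w*x)`, the L1 penalty used to encourage sparse formulas, the helper names `family`, `loss`, `local_search`, and `fit`, and the simple random-restart hill climbing that stands in for a more powerful global optimizer. The families and the optimizer actually used by ParFam are described in the paper and the repository.

```python
import math
import random

# Hypothetical parametric family (an assumption for illustration):
# f_theta(x) = a + b*x + c*sin(w*x), with theta = (a, b, c, w).
def family(theta, x):
    a, b, c, w = theta
    return a + b * x + c * math.sin(w * x)

def loss(theta, xs, ys, l1=1e-3):
    """Mean squared error plus an L1 penalty that favors sparse parameters."""
    mse = sum((family(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return mse + l1 * sum(abs(t) for t in theta)

def local_search(theta, xs, ys, rng, steps=300, scale=0.5):
    """Stochastic hill climbing: keep a Gaussian perturbation only if it lowers the loss."""
    best, best_loss = list(theta), loss(theta, xs, ys)
    for _ in range(steps):
        cand = [t + rng.gauss(0.0, scale) for t in best]
        cand_loss = loss(cand, xs, ys)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
        else:
            scale *= 0.98  # shrink the step size after a rejected move
    return best, best_loss

def fit(xs, ys, restarts=20, seed=0):
    """Random-restart search: a simple stand-in for a real global optimizer."""
    rng = random.Random(seed)
    # The all-zero start guarantees the result is never worse than the empty formula.
    starts = [[0.0] * 4]
    starts += [[rng.uniform(-3.0, 3.0) for _ in range(4)] for _ in range(restarts)]
    best, best_loss = None, math.inf
    for start in starts:
        theta, value = local_search(start, xs, ys, rng)
        if value < best_loss:
            best, best_loss = theta, value
    return best, best_loss

if __name__ == "__main__":
    # Synthetic data from y = 2*x + sin(3*x), which lies inside the toy family.
    xs = [i / 10 for i in range(-30, 31)]
    ys = [family((0.0, 2.0, 1.0, 3.0), x) for x in xs]
    theta, value = fit(xs, ys)
    print([round(t, 3) for t in theta], value)
```

In this sketch, parameters driven close to zero by the L1 penalty correspond to terms that would be dropped from the final formula, which is how a continuous fit can yield a compact symbolic expression. The quality of the result depends heavily on the optimizer, so the hill climber here should be read as a placeholder rather than a recommendation.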