There exist endless examples of dynamical systems with vast available data and unsatisfying mathematical descriptions. Sparse regression applied to symbolic libraries has quickly emerged as a powerful tool for learning governing equations directly from data; these learned equations balance quantitative accuracy with qualitative simplicity and human interpretability. Here, I present a general-purpose, model-agnostic sparse regression algorithm that extends a recently proposed exhaustive search leveraging iterative Singular Value Decompositions (SVD). This accelerated scheme, Scalable Pruning for Rapid Identification of Null vecTors (SPRINT), uses bisection with analytic bounds to quickly identify optimal rank-1 modifications to null vectors. It is intended to maintain sensitivity to small coefficients while keeping the computational cost reasonable for large symbolic libraries. A calculation that would take the age of the universe with an exhaustive search can be achieved in a day with SPRINT.
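To make the null-vector framing concrete, the following is a minimal sketch of SVD-based backward elimination over a symbolic library. It is not SPRINT itself: it recomputes a full SVD for every candidate removal and uses no bisection or analytic bounds, so it only illustrates the kind of iterative SVD search that SPRINT is meant to accelerate. The function names, the library matrix `G` (rows are samples, columns are library terms), and the use of the smallest singular value as the residual are all assumptions made for illustration.

```python
import numpy as np


def smallest_singular_pair(G):
    """Return the smallest singular value of G and its right singular vector.

    Assumes G has at least as many rows (samples) as columns (terms).
    """
    _, s, vt = np.linalg.svd(G, full_matrices=False)
    return s[-1], vt[-1]


def greedy_null_vector_pruning(G):
    """Hypothetical helper: drop one library term at a time.

    At each sparsity level, the unit-norm coefficient vector c minimizing
    ||G c|| over the active terms is the smallest right singular vector.
    The term whose removal gives the smallest new residual is then pruned.
    Returns a list of (active_indices, residual, coefficients) tuples,
    ordered from the full library down to a single term.
    """
    n_terms = G.shape[1]
    active = list(range(n_terms))
    path = []
    while True:
        sigma, v = smallest_singular_pair(G[:, active])
        c = np.zeros(n_terms)
        c[active] = v
        path.append((list(active), sigma, c))
        if len(active) == 1:
            break

        def residual_without(j):
            # Residual of the best fit after removing term j (one SVD each).
            keep = [k for k in active if k != j]
            return smallest_singular_pair(G[:, keep])[0]

        active.remove(min(active, key=residual_without))
    return path
```

For example, with library columns `1, x, x**2, x**3, y` sampled from `y = 2*x - 0.5*x**3`, the three-term entry of the returned path should retain the `x`, `x**3`, and `y` columns with a residual near machine precision. The cost of this naive version grows quickly with library size, since every pruning step performs one SVD per remaining term.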