GN-SINDy: Greedy Sampling Neural Network in Sparse Identification of Nonlinear Partial Differential Equations

The sparse identification of nonlinear dynamical systems (SINDy) is a data-driven technique for uncovering and representing the fundamental dynamics of complex systems from observational data. However, a primary obstacle in discovering models for nonlinear partial differential equations (PDEs) is the curse of dimensionality and the large datasets involved. Consequently, the strategic selection of the most informative samples within a given dataset plays a crucial role in reducing computational cost and enhancing the effectiveness of SINDy-based algorithms. To this end, we apply a greedy sampling approach to the snapshot matrix of a PDE to obtain its most valuable samples, which are suitable for training a deep neural network (DNN) in a SINDy framework. SINDy-based algorithms typically consist of a data collection unit, the construction of a dictionary of basis functions, the computation of the time derivative, and the solution of a sparse identification problem, which reduces to a regularized least-squares minimization. In this paper, we extend the results of a SINDy-based deep learning model discovery (DeePyMoD) approach by integrating a greedy sampling technique into its data collection unit and new sparsity-promoting algorithms into its least-squares minimization unit. In this regard, we introduce the greedy sampling neural network in sparse identification of nonlinear partial differential equations (GN-SINDy), which blends a greedy sampling method, the DNN, and the SINDy algorithm. In the implementation phase, to show the effectiveness of GN-SINDy, we compare its results with those of DeePyMoD on numerous PDE discovery tasks, using a Python package prepared for this purpose.
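To make the pipeline described above concrete, the following is a minimal sketch under stated assumptions, not the GN-SINDy implementation. It assumes a toy heat equation u_t = D u_xx with a known Gaussian solution, a pivoted Gram-Schmidt rule as the greedy sampler (the abstract does not say which greedy method is used), analytic derivatives standing in for automatic differentiation through the DNN, and sequentially thresholded least squares as the sparsity-promoting solver. The names `greedy_columns`, `stlsq`, and the five-term dictionary are hypothetical and chosen only for illustration.

```python
import numpy as np

def greedy_columns(matrix, n_select):
    """Greedily pick the columns with the largest residual norm
    (pivoted Gram-Schmidt; an assumed stand-in for the greedy sampler)."""
    residual = np.array(matrix, dtype=float)
    chosen = []
    for _ in range(n_select):
        norms = np.linalg.norm(residual, axis=0)
        j = int(np.argmax(norms))
        chosen.append(j)
        q = residual[:, j] / norms[j]
        # Remove the component along the chosen column from every column.
        residual -= np.outer(q, q @ residual)
    return np.array(chosen)

def stlsq(theta, target, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares (assumed sparsity-promoting solver)."""
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(xi) >= threshold
        xi[~keep] = 0.0
        if not keep.any():
            break
        # Refit using only the dictionary terms that survived thresholding.
        xi[keep] = np.linalg.lstsq(theta[:, keep], target, rcond=None)[0]
    return xi

# Toy snapshot matrix: rows are spatial points, columns are time instances.
D = 0.5                                   # assumed diffusion coefficient
x = np.linspace(-5.0, 5.0, 101)
t = np.linspace(0.0, 2.0, 51)
X, T = np.meshgrid(x, t, indexing="ij")
tau = T + 0.5                             # time shift keeps the Gaussian smooth at t = 0
U = np.exp(-X**2 / (4 * D * tau)) / np.sqrt(4 * np.pi * D * tau)

# Analytic derivatives (stand-in for differentiating a trained DNN surrogate).
U_x = -X / (2 * D * tau) * U
U_xx = (X**2 / (4 * D**2 * tau**2) - 1 / (2 * D * tau)) * U
U_t = (X**2 / (4 * D * tau**2) - 1 / (2 * tau)) * U

# Greedy selection of informative time instances and spatial points.
t_idx = greedy_columns(U, 8)              # columns of U index time
x_idx = greedy_columns(U.T, 8)            # columns of U.T index space
rows, cols = np.meshgrid(x_idx, t_idx, indexing="ij")
rows, cols = rows.ravel(), cols.ravel()

# Dictionary of candidate terms evaluated only at the sampled points.
u, u_x, u_xx, u_t = (A[rows, cols] for A in (U, U_x, U_xx, U_t))
names = ["1", "u", "u_x", "u_xx", "u*u_x"]
theta = np.column_stack([np.ones_like(u), u, u_x, u_xx, u * u_x])

xi = stlsq(theta, u_t)
print(dict(zip(names, np.round(xi, 4))))
```

In this sketch only 64 of the 5151 grid points enter the regression, which illustrates why sample selection reduces the cost of the least-squares step. With noisy measurements the derivatives would have to come from a fitted surrogate, which is the role the DNN plays in the approach described above.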
