Traditional data-driven methods, while effective for deterministic systems and for stochastic differential equations (SDEs) with Gaussian noise, cannot handle the discontinuous sample paths and heavy-tailed fluctuations characteristic of Lévy processes, particularly when the noise is state-dependent. To bridge this gap, we establish nonlocal Kramers-Moyal formulas that rigorously generalize the classical Kramers-Moyal relations to SDEs with multiplicative Lévy noise. These formulas directly link short-time transition probability densities (or sample path statistics) to the underlying SDE coefficients: the drift vector, the diffusion matrix, the Lévy jump measure kernel, and the Lévy noise intensity functions. Building on these theoretical foundations, we develop novel data-driven algorithms that identify all governing components simultaneously from data, and we establish convergence results and an error analysis for these algorithms. We validate the framework through extensive numerical experiments on prototypical systems. This work provides a principled and practical toolbox for discovering interpretable SDE models of complex systems driven by discontinuous, heavy-tailed, state-dependent fluctuations, with broad applicability in climate science, neuroscience, epidemiology, finance, and biological physics.
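For orientation, the classical Kramers-Moyal relations that the nonlocal formulas generalize can be illustrated in the Gaussian setting. The following is a minimal sketch, not the algorithm developed in this work: it assumes a one-dimensional Ornstein-Uhlenbeck process with hypothetical parameters `theta` and `sigma`, simulated by the Euler-Maruyama scheme, and estimates the drift and the squared diffusion coefficient from binned conditional moments of the increments. The function names `simulate_ou` and `kramers_moyal_estimates` are illustrative only. Jumps and state-dependent Lévy noise, which are the subject of the nonlocal formulas, are deliberately not covered.

```python
import numpy as np


def simulate_ou(theta=1.0, sigma=0.5, dt=1e-3, n_steps=400_000, seed=0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dB (Gaussian noise only)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = 0.0
    # Brownian increments have variance dt.
    noise = rng.standard_normal(n_steps) * np.sqrt(dt)
    for k in range(n_steps):
        x[k + 1] = x[k] - theta * x[k] * dt + sigma * noise[k]
    return x


def kramers_moyal_estimates(x, dt, bins):
    """Classical Kramers-Moyal estimates on a grid of bins.

    drift(x)     ~ E[X_{t+dt} - X_t     | X_t = x] / dt
    diffusion(x) ~ E[(X_{t+dt} - X_t)^2 | X_t = x] / dt   (equals sigma(x)^2)
    """
    dx = np.diff(x)
    # Bin index of the starting point of each increment; out-of-range
    # points get index -1 or len(bins) - 1 and are ignored below.
    idx = np.digitize(x[:-1], bins) - 1
    n_bins = len(bins) - 1
    centers = 0.5 * (bins[:-1] + bins[1:])
    drift = np.full(n_bins, np.nan)
    diffusion = np.full(n_bins, np.nan)
    for j in range(n_bins):
        mask = idx == j
        if mask.any():
            drift[j] = dx[mask].mean() / dt
            diffusion[j] = (dx[mask] ** 2).mean() / dt
    return centers, drift, diffusion


if __name__ == "__main__":
    dt = 1e-3
    path = simulate_ou(dt=dt)
    centers, drift, diffusion = kramers_moyal_estimates(
        path, dt, np.linspace(-0.6, 0.6, 7)
    )
    for c, b, a in zip(centers, drift, diffusion):
        print(f"x = {c:+.2f}   drift = {b:+.3f}   sigma^2 = {a:.3f}")
```

In this Gaussian example the second conditional moment recovers `sigma**2` accurately, while the drift estimate is noisier and improves only with a longer observation window. When the path contains jumps, these unrestricted moments are contaminated by large increments, which is the situation the nonlocal formulas are designed to address.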