Learning unknown stochastic differential equations (SDEs) from observed data is an important and challenging task with applications in many fields. Current approaches typically represent the drift and diffusion functions with neural networks and train them with a likelihood-based loss built from an approximation of the transition density. However, these methods usually rely on one-step stochastic numerical schemes and therefore require data with sufficiently high time resolution. In this paper, we introduce two novel approximations to the transition density of the parameterized SDE: a Gaussian density approximation inspired by the random perturbation theory of dynamical systems, and its extension, the dynamical Gaussian mixture approximation (DynGMA). Benefiting from this robust density approximation, our method is more accurate than baseline methods both in learning fully unknown drift and diffusion functions and in computing the invariant distribution from trajectory data. It can also handle trajectory data with low time resolution and with variable, even uncontrollable, time step sizes, such as data generated from Gillespie's stochastic simulations. Experiments across a range of scenarios verify the advantages and robustness of the proposed method.
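For context on the one-step schemes that the abstract identifies as the baseline, the sketch below shows a likelihood-based loss built from the Euler-Maruyama Gaussian approximation of the transition density, $x_{t+\Delta t} \mid x_t \sim \mathcal{N}\big(x_t + f(x_t)\,\Delta t,\; \sigma(x_t)\sigma(x_t)^\top \Delta t\big)$. It is not the DynGMA approximation proposed in the paper, only an illustration of the kind of loss that requires small time steps. The names `euler_maruyama_nll`, `drift`, and `diffusion` are hypothetical and chosen for illustration; in practice the two callables would be neural networks.

```python
import numpy as np


def euler_maruyama_nll(x0, x1, dt, drift, diffusion):
    """Per-sample negative log-likelihood of x1 given x0 under a
    one-step Euler-Maruyama Gaussian transition density.

    x0, x1    : arrays of shape (N, D), consecutive observations
    dt        : scalar or array of shape (N,), time step of each pair
    drift     : callable, (N, D) -> (N, D)        (assumed interface)
    diffusion : callable, (N, D) -> (N, D, D)     (assumed interface)
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    n, d = x0.shape
    dt = np.broadcast_to(np.asarray(dt, dtype=float), (n,))

    # Gaussian mean and covariance of the one-step approximation.
    mean = x0 + drift(x0) * dt[:, None]
    sigma = diffusion(x0)
    cov = sigma @ np.swapaxes(sigma, -1, -2) * dt[:, None, None]

    # Mahalanobis distance and log-determinant, batched over N.
    resid = x1 - mean
    solved = np.linalg.solve(cov, resid[..., None])[..., 0]
    maha = np.sum(resid * solved, axis=-1)
    _, logdet = np.linalg.slogdet(cov)

    return 0.5 * (maha + logdet + d * np.log(2.0 * np.pi))


# Usage on a 1-D Ornstein-Uhlenbeck-type example: f(x) = -x, sigma = 1.
def ou_drift(x):
    return -x


def unit_diffusion(x):
    return np.ones((x.shape[0], 1, 1))


nll = euler_maruyama_nll(
    x0=np.array([[1.0], [0.5]]),
    x1=np.array([[0.9], [0.4]]),
    dt=np.array([0.1, 0.2]),  # variable step sizes are allowed
    drift=ou_drift,
    diffusion=unit_diffusion,
)
loss = nll.mean()
```

The approximation treats the drift and diffusion as constant over each interval, so its error grows with the step size; this is the limitation that motivates a more robust density approximation for low-resolution data.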