Learning unknown stochastic differential equations (SDEs) from observed data is a significant and challenging task with applications in various fields. Current approaches often use neural networks to represent the drift and diffusion functions, and train these networks with a likelihood-based loss constructed by approximating the transition density. However, such methods typically rely on one-step stochastic numerical schemes and therefore require data with sufficiently high time resolution. In this paper, we introduce novel approximations to the transition density of the parameterized SDE: a Gaussian density approximation inspired by the random perturbation theory of dynamical systems, and its extension, the dynamical Gaussian mixture approximation (DynGMA). Benefiting from the robust density approximation, our method achieves superior accuracy over baseline methods in learning fully unknown drift and diffusion functions and in computing the invariant distribution from trajectory data. Moreover, it can handle trajectory data with low time resolution and variable, even uncontrollable, time step sizes, such as data generated from Gillespie's stochastic simulations. We conduct experiments across various scenarios to verify the advantages and robustness of the proposed method.
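For context, a minimal sketch of the baseline approach the abstract refers to (not the paper's DynGMA): a one-step Euler-Maruyama scheme induces the Gaussian transition-density approximation p(x_{t+h} | x_t) ≈ N(x_t + f(x_t) h, g(x_t)^2 h) for a scalar SDE dX = f(X) dt + g(X) dW, from which a likelihood-based loss is built. The function names and the Ornstein-Uhlenbeck test case below are illustrative assumptions.

```python
import numpy as np

def em_neg_log_likelihood(f, g, traj, h):
    """Negative log-likelihood of a 1D trajectory under the one-step
    Euler-Maruyama Gaussian transition-density approximation.

    In a learning setting, f and g would be neural networks and this
    quantity the training loss; here they are plain callables."""
    x0, x1 = traj[:-1], traj[1:]
    mean = x0 + f(x0) * h          # Gaussian mean: one Euler step
    var = g(x0) ** 2 * h           # Gaussian variance: diffusion^2 * h
    resid = x1 - mean
    return 0.5 * np.sum(np.log(2 * np.pi * var) + resid ** 2 / var)

# Illustrative data: simulate an Ornstein-Uhlenbeck process
# dX = -X dt + 0.5 dW with the same Euler-Maruyama scheme.
rng = np.random.default_rng(0)
h, n = 0.01, 1000
x = np.empty(n)
x[0] = 1.0
for i in range(n - 1):
    x[i + 1] = x[i] - x[i] * h + 0.5 * np.sqrt(h) * rng.standard_normal()

nll = em_neg_log_likelihood(lambda x: -x,
                            lambda x: 0.5 * np.ones_like(x), x, h)
print(np.isfinite(nll))  # → True
```

As the abstract notes, the Gaussian in this one-step construction is only accurate when the step size h is small, which is why such baselines need high-resolution data; the paper's density approximations are designed to relax that requirement.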