Stochastic differential equations (SDEs) are a fundamental tool for modelling dynamic processes, including gene regulatory networks (GRNs), contaminant transport, financial markets, and image generation. However, learning the underlying SDE from observational data is a challenging task, especially when individual trajectories are not observable. Motivated by burgeoning research on single-cell datasets, we present the first comprehensive approach for jointly estimating the drift and diffusion of an SDE from its temporal marginals. Assuming linear drift and additive diffusion, we prove that these parameters are identifiable from marginals if and only if the initial distribution is not invariant to a class of generalized rotations, a condition that is satisfied by most distributions. We further prove that the causal graph of any SDE with additive diffusion can be recovered from the SDE parameters. To complement this theory, we adapt entropy-regularized optimal transport to handle anisotropic diffusion, and introduce APPEX (Alternating Projection Parameter Estimation from $X_0$), an iterative algorithm designed to estimate the drift, diffusion, and causal graph of an additive noise SDE solely from temporal marginals. We show that each of these steps is asymptotically optimal with respect to the Kullback-Leibler divergence, and demonstrate APPEX's effectiveness on simulated data from linear additive noise SDEs.
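To make the observational setting concrete, the following is a minimal sketch of how temporal marginals of a linear additive noise SDE could be simulated. It is not the APPEX algorithm and is not taken from the paper. It assumes the model can be written as $dX_t = A X_t\,dt + G\,dW_t$, where the symbols $A$ (drift matrix) and $G$ (diffusion matrix) are our notation, and it uses a plain Euler-Maruyama discretization. The function name `simulate_marginals`, its parameters, and the example values are all hypothetical.

```python
import numpy as np


def simulate_marginals(A, G, x0_sampler, times, n_samples, dt=1e-3, seed=0):
    """Sample temporal marginals of dX_t = A X_t dt + G dW_t (assumed notation).

    A fresh population is simulated for every observation time, so samples
    at different times cannot be linked into trajectories.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    G = np.asarray(G, dtype=float)
    marginals = []
    for t in times:
        # Draw a new population from the initial distribution of X_0.
        x = np.asarray(x0_sampler(rng, n_samples), dtype=float)
        n_steps = int(round(t / dt))
        for _ in range(n_steps):
            # Brownian increments with variance dt in each coordinate.
            dw = rng.normal(scale=np.sqrt(dt), size=(n_samples, G.shape[1]))
            # Euler-Maruyama step: linear drift plus additive noise.
            x = x + (x @ A.T) * dt + dw @ G.T
        # Shuffle rows so that row order carries no information.
        marginals.append(x[rng.permutation(n_samples)])
    return marginals


# Illustrative (hypothetical) parameters: a 2-D system with anisotropic noise.
A_example = np.array([[-1.0, 0.0], [0.5, -1.0]])
G_example = np.diag([1.0, 0.3])
snapshots = simulate_marginals(
    A_example,
    G_example,
    x0_sampler=lambda rng, n: rng.normal(size=(n, 2)) + np.array([2.0, 0.0]),
    times=[0.0, 0.1, 0.2],
    n_samples=500,
    dt=0.01,
)
```

Each entry of `snapshots` is an array of unpaired samples at one observation time, which is the kind of input an estimator working solely from temporal marginals would receive.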