Conditional Score-Based Modeling of Effective Langevin Dynamics

Stochastic reduced-order models are widely used to represent the effective dynamics of complex systems, but estimating their drift and diffusion coefficients from data remains challenging. Standard approaches often rely on short-time trajectory increments, state-space partitioning, or repeated simulation of candidate models, which become unreliable or computationally expensive for high-dimensional systems, coarse temporal sampling, or unevenly sampled data. We introduce a data-driven calibration method based on a novel relationship between the coefficients of a stochastic reduced model and the conditional score of the finite-time transition density, defined as the gradient of the logarithm of the transition density with respect to the initial state. The resulting identity expresses derivatives of lagged correlation functions as stationary expectations over observed lagged pairs involving this conditional score and the unknown model coefficients. This formulation allows the drift and diffusion structure to be constrained directly from finite-lag statistics, without differentiating trajectories, partitioning state space, or repeatedly integrating candidate reduced models during calibration, yielding a least-squares fitting problem over stationary lagged pairs. We validate the approach on three systems of increasing complexity: an analytically tractable Cox--Ingersoll--Ross diffusion, a two-dimensional nonequilibrium diffusion with affine multiplicative noise, and a periodic soft-spin stochastic Landau--Lifshitz chain. Across these tests, the inferred models preserve the invariant statistics while reproducing finite-lag dynamical correlations. The framework provides a scalable route for learning stochastic reduced-order models from data that reproduce prescribed statistical and dynamical properties.
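To make the least-squares formulation concrete, here is a minimal sketch on a one-dimensional Ornstein-Uhlenbeck process, dX = -theta X dt + sigma dW. It is an illustration under stated assumptions, not the authors' implementation. The identity used is one form that follows from the backward Kolmogorov equation and may differ from the one derived in the paper: for observables f and g, with s the conditional score and a = sigma^2, d/dtau E[f(X_0) g(X_tau)] = E[f(X_0) g(X_tau) (b(X_0) s + (1/2) a (s^2 + ds/dx_0))]. The Ornstein-Uhlenbeck test case, the closed-form Gaussian score (standing in for a learned score model), the linear drift ansatz b(x) = -theta x, the choice of observables, and all function names are assumptions made for this example.

```python
import numpy as np

# Hypothetical ground truth, used only to generate data and the analytic score.
THETA_TRUE, SIGMA_TRUE = 1.0, 1.0


def ou_step(x, dt, rng):
    """Advance samples by time dt with the exact Gaussian OU transition."""
    decay = np.exp(-THETA_TRUE * dt)
    var = SIGMA_TRUE**2 / (2 * THETA_TRUE) * (1 - decay**2)
    return decay * x + np.sqrt(var) * rng.normal(size=x.shape)


def conditional_score(x0, y, tau):
    """Return s = d/dx0 log p_tau(y | x0) and ds/dx0 for the OU kernel.

    Assumption: the closed form stands in for a score model learned from data.
    """
    decay = np.exp(-THETA_TRUE * tau)
    var = SIGMA_TRUE**2 / (2 * THETA_TRUE) * (1 - decay**2)
    s = decay * (y - decay * x0) / var
    ds = np.full_like(x0, -decay**2 / var)
    return s, ds


def rows_for_lag(tau, h, n, rng):
    """Build rows [coef_theta, coef_a] and targets dC/dtau at a single lag."""
    x0 = rng.normal(0.0, SIGMA_TRUE / np.sqrt(2 * THETA_TRUE), size=n)  # stationary
    y_minus = ou_step(x0, tau - h, rng)  # state at lag tau - h
    y = ou_step(y_minus, h, rng)         # state at lag tau
    y_plus = ou_step(y, h, rng)          # state at lag tau + h
    s, ds = conditional_score(x0, y, tau)

    # Observable pairs (f, g) defining C_fg(tau) = E[f(X_0) g(X_tau)].
    observables = [
        (lambda x: x, lambda z: z),
        (lambda x: np.ones_like(x), lambda z: z**2),
    ]
    rows, targets = [], []
    for f, g in observables:
        # Central finite difference of the lagged correlation in tau.
        targets.append(np.mean(f(x0) * (g(y_plus) - g(y_minus))) / (2 * h))
        w = f(x0) * g(y)
        # With b(x) = -theta * x and constant a, the identity is linear in (theta, a).
        rows.append([np.mean(-w * x0 * s), np.mean(0.5 * w * (s**2 + ds))])
    return rows, targets


def calibrate(lags=(0.25, 0.5, 1.0), h=0.05, n=1_000_000, seed=0):
    """Least-squares estimate of (theta, a = sigma^2) from stationary lagged pairs."""
    rng = np.random.default_rng(seed)
    A, b = [], []
    for tau in lags:
        rows, targets = rows_for_lag(tau, h, n, rng)
        A += rows
        b += targets
    (theta_hat, a_hat), *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return theta_hat, a_hat


if __name__ == "__main__":
    theta_hat, a_hat = calibrate()
    print(f"theta_hat = {theta_hat:.3f}, a_hat = {a_hat:.3f}")
```

In this sketch no candidate model is integrated during the fit and no trajectory is differentiated: the unknowns enter only through averages over lagged pairs. The pair (f, g) = (1, x^2) has a vanishing correlation derivative at stationarity, so its rows constrain the ratio a / theta, that is, the invariant variance, while the pair (x, x) fixes the relaxation rate.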
