A state-space model is a statistical framework for inferring latent states from observed time-series data. Inference remains challenging, however, when the model is nonlinear and high-dimensional. Score-based Data Assimilation (SDA), an approach built on diffusion models (a powerful class of deep generative models), was developed to address this challenge. SDA cannot be applied directly, though, when the latent-state transition depends on unknown parameters that must be inferred jointly with the latent states. To overcome this limitation, we propose a framework that enables SDA to handle latent-state transitions with unknown parameters. A key feature of the proposed method is the incorporation of the self-organization technique, which has been used in classical state-space modeling for the joint estimation of latent states and parameters. By integrating this classical technique into modern SDA, our method enables joint inference of latent states and unknown parameters while maintaining the high training efficiency of SDA. The effectiveness of the proposed approach is validated through numerical experiments on dynamical systems arising in neuroscience and atmospheric science. In addition, its scalability is demonstrated on a high-dimensional Kolmogorov flow, with a data dimension on the order of several hundred thousand.
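To make the self-organization idea concrete, the following is a minimal sketch of the classical state-augmentation trick: the unknown parameters are appended to the latent state, and the augmented state evolves with the parameters carried forward unchanged (or with a small random walk). This is an illustration under stated assumptions, not the paper's implementation. The toy dynamics `transition`, the uniform prior on `theta`, and all function names are hypothetical.

```python
import numpy as np

def transition(x, theta, rng, noise_std=0.1):
    # Hypothetical toy dynamics: x_{t+1} = theta * x_t + Gaussian noise.
    return theta * x + noise_std * rng.standard_normal(x.shape)

def augmented_transition(z, rng, state_dim, param_noise_std=0.0):
    # Split the augmented state z = (x, theta) into its two parts.
    x, theta = z[:state_dim], z[state_dim:]
    x_next = transition(x, theta, rng)
    # Parameters are copied forward; a nonzero param_noise_std would
    # turn this into a small random walk on theta.
    theta_next = theta + param_noise_std * rng.standard_normal(theta.shape)
    return np.concatenate([x_next, theta_next])

def simulate_augmented(z0, num_steps, rng, state_dim):
    # Roll the augmented state forward and stack it into a trajectory.
    traj = [z0]
    for _ in range(num_steps):
        traj.append(augmented_transition(traj[-1], rng, state_dim))
    return np.stack(traj)

rng = np.random.default_rng(0)
theta0 = rng.uniform(0.5, 0.95, size=1)  # assumed prior over the parameter
x0 = np.ones(1)
trajectory = simulate_augmented(
    np.concatenate([x0, theta0]), num_steps=20, rng=rng, state_dim=1
)
print(trajectory.shape)  # (time steps + 1, state_dim + number of parameters)
```

Trajectories of this augmented state, with `theta` redrawn from the prior for each trajectory, are the kind of data on which a generative model over states and parameters could plausibly be trained, so that inferring the augmented state amounts to inferring both jointly.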