A state-space model is a statistical framework for inferring latent states from observed time-series data, but inference with nonlinear, high-dimensional state-space models remains challenging. Score-based Data Assimilation (SDA), an approach based on diffusion models (a powerful class of deep generative models), has been developed to address this challenge. However, SDA cannot be directly applied when the latent-state transition depends on unknown parameters that must be inferred jointly with the latent states. To overcome this limitation, we propose a framework that enables SDA to handle latent-state transitions that depend on unknown parameters. A key feature of the proposed method is the incorporation of the self-organization technique, which has been used in classical state-space modeling for the joint estimation of latent states and parameters. By integrating this classical technique into modern SDA, our method enables joint inference of latent states and unknown parameters while maintaining the high training efficiency of SDA. The effectiveness of the proposed approach is validated through numerical experiments on dynamical systems arising in neuroscience and atmospheric science. In addition, its scalability is demonstrated using a high-dimensional Kolmogorov flow, with the data dimension on the order of several hundred thousand.
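As a rough illustration of the classical self-organization idea mentioned above, the unknown parameter can be appended to the latent state so that both are treated as a single augmented state. The sketch below is only a toy example under stated assumptions: the scalar transition `x' = theta * x + noise`, the noise scales, and all function names are hypothetical, and it does not reproduce the SDA-based method proposed here.

```python
import random


def transition(x, theta, rng):
    """Hypothetical scalar transition: x' = theta * x + Gaussian noise."""
    return theta * x + rng.gauss(0.0, 0.1)


def augmented_transition(z, rng, param_noise_std=0.0):
    """Transition on the augmented state z = (x, theta).

    The parameter is carried along as part of the state. With
    param_noise_std == 0 it stays constant; a small positive value
    lets it drift slightly (an illustrative assumption).
    """
    x, theta = z
    x_next = transition(x, theta, rng)
    if param_noise_std > 0:
        theta_next = theta + rng.gauss(0.0, param_noise_std)
    else:
        theta_next = theta
    return (x_next, theta_next)


def simulate(z0, steps, seed=0, param_noise_std=0.0):
    """Roll the augmented state forward and return the full trajectory."""
    rng = random.Random(seed)
    trajectory = [z0]
    for _ in range(steps):
        trajectory.append(augmented_transition(trajectory[-1], rng, param_noise_std))
    return trajectory


# Augmented initial state: latent state x = 1.0, parameter theta = 0.9.
trajectory = simulate((1.0, 0.9), steps=50)
```

Any method that infers the trajectory of the augmented state then yields estimates of the latent state and the parameter together, which is the sense in which the estimation is joint.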