Data assimilation is the process of estimating the time-evolving state of a dynamical system by integrating model predictions and noisy observations. It is commonly formulated as Bayesian filtering, but classical filters often struggle with accuracy or computational feasibility in high dimensions. Recently, score-based generative models have emerged as a scalable approach for high-dimensional data assimilation, enabling accurate modeling and sampling of complex distributions. However, existing score-based filters often specify the forward process independently of the data assimilation task. As a result, the measurement-update step depends on heuristic approximations of the likelihood score, which can accumulate errors and degrade performance over time. Here, we propose a measurement-aware score-based filter (MASF) that defines a measurement-aware forward process directly from the measurement equation. This construction makes the likelihood score analytically tractable: for linear measurements, we derive the exact likelihood score and combine it with a learned prior score to obtain the posterior score. Numerical experiments covering a range of settings, including high-dimensional datasets, demonstrate improved accuracy and stability over existing score-based filters.
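The decomposition of the posterior score into a prior score plus a likelihood score can be illustrated with a minimal scalar sketch. It rests on assumptions that are not part of the abstract: a Gaussian prior N(mean, var) stands in for the learned prior score, and the likelihood score is taken at the undiffused level for a linear measurement y = h*x + noise with noise variance r. This is not the MASF forward process itself, and all function and parameter names are hypothetical. In this special case the summed score should vanish at the Kalman posterior mean, which gives a simple consistency check.

```python
# Illustrative sketch only: scalar linear-Gaussian case, not the MASF method.
# All names (prior_score, likelihood_score, ...) are hypothetical.


def prior_score(x, mean, var):
    """d/dx log p(x) for a Gaussian prior N(mean, var)."""
    return -(x - mean) / var


def likelihood_score(x, y, h, r):
    """d/dx log p(y | x) for y = h * x + noise, noise ~ N(0, r)."""
    return h * (y - h * x) / r


def posterior_score(x, y, mean, var, h, r):
    """Bayes' rule in score form: posterior = prior + likelihood score."""
    return prior_score(x, mean, var) + likelihood_score(x, y, h, r)


def kalman_update(y, mean, var, h, r):
    """Closed-form posterior mean and variance for the same scalar model."""
    gain = var * h / (h * h * var + r)
    post_mean = mean + gain * (y - h * mean)
    post_var = (1.0 - gain * h) * var
    return post_mean, post_var


if __name__ == "__main__":
    y, mean, var, h, r = 1.5, 0.0, 2.0, 0.5, 0.1
    post_mean, post_var = kalman_update(y, mean, var, h, r)
    # The posterior score is zero at the posterior mean (the mode of a Gaussian).
    print("score at posterior mean:", posterior_score(post_mean, y, mean, var, h, r))
    # The score is linear in x with slope equal to minus the posterior precision.
    slope = (posterior_score(post_mean + 1.0, y, mean, var, h, r)
             - posterior_score(post_mean, y, mean, var, h, r))
    print("slope:", slope, "expected:", -1.0 / post_var)
```

In a score-based filter the prior score would come from a trained network and would be evaluated along the diffusion time axis. The sketch only checks that adding the two scores reproduces the known Gaussian posterior when everything is linear and Gaussian.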