DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants

Data assimilation (DA) is a cornerstone of scientific and engineering applications, combining model forecasts with sparse and noisy observations to estimate latent system states. Classical high-dimensional DA methods, such as the ensemble Kalman filter, rely on Gaussian approximations that are violated for complex dynamics or observation operators. To address this limitation, we introduce DAISI, a scalable filtering algorithm built on flow-based generative models that enables flexible probabilistic inference using data-driven priors. The core idea is to use a stationary, pre-trained generative prior that first incorporates forecast information through a novel inverse-sampling step, before assimilating observations via guidance-based conditional sampling. This allows us to leverage any forecasting model as part of the DA pipeline without having to retrain or fine-tune the generative prior at each assimilation step. Experiments on challenging nonlinear systems show that DAISI achieves accurate filtering results in regimes with sparse, noisy, and nonlinear observations where traditional methods struggle. The code for DAISI is available at https://github.com/Erik-Wikingsson/DAISI.
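
The inverse-sampling idea can be illustrated on a toy problem. The sketch below is not DAISI's implementation. It assumes a 1-D Gaussian prior and a linear interpolant x_t = (1 - t) x0 + t x1, for which the flow velocity is known in closed form, and it shows only how forecast samples could be mapped to latent space by integrating the flow ODE backward in time. The guidance-based observation update is omitted, and every name and value (`M`, `S`, `velocity`, `to_latent`, `from_latent`) is hypothetical.

```python
import math

# Toy 1-D setup (all values are illustrative assumptions, not from the paper):
# latent x0 ~ N(0, 1), prior x1 ~ N(M, S^2), interpolant x_t = (1 - t) x0 + t x1.
M, S = 2.0, 0.5

def velocity(t, x):
    """Closed-form flow velocity E[x1 - x0 | x_t = x] for the Gaussian toy."""
    var_t = (1.0 - t) ** 2 + (t * S) ** 2   # Var(x_t), strictly positive on [0, 1]
    cov = t * S ** 2 - (1.0 - t)            # Cov(x1 - x0, x_t)
    return M + cov / var_t * (x - t * M)

def integrate(x, t_start, t_end, n_steps=200):
    """Integrate dx/dt = velocity(t, x) from t_start to t_end with classical RK4."""
    h = (t_end - t_start) / n_steps         # negative when integrating backward
    t = t_start
    for _ in range(n_steps):
        k1 = velocity(t, x)
        k2 = velocity(t + h / 2, x + h / 2 * k1)
        k3 = velocity(t + h / 2, x + h / 2 * k2)
        k4 = velocity(t + h, x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

def to_latent(x_forecast):
    """Inverse step: run the flow backward (t = 1 -> 0) from a forecast sample."""
    return integrate(x_forecast, 1.0, 0.0)

def from_latent(z):
    """Forward sampling: run the flow from t = 0 to t = 1."""
    return integrate(z, 0.0, 1.0)

# Hypothetical forecast ensemble produced by some external forecasting model.
forecast_ensemble = [1.4, 2.0, 2.9]
latents = [to_latent(x) for x in forecast_ensemble]
roundtrip = [from_latent(z) for z in latents]

for x, z, r in zip(forecast_ensemble, latents, roundtrip):
    print(f"forecast={x:.3f}  latent={z:.4f}  roundtrip={r:.4f}")
```

In this toy the exact flow map is x1 = M + S·x0, so the recovered latents should approximate (x - M) / S, and pushing them forward again should reproduce the forecast samples. In a filtering setting, the forward pass would be where an observation-guidance term is added to the velocity; that part is left out here because the abstract does not specify it.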

