Incorporating unstructured data into physical models is an emerging challenge in data assimilation. Traditional approaches focus on well-defined observation operators whose functional forms are typically assumed to be known. This prevents these methods from achieving a consistent model-data synthesis in configurations where the mapping from data space to model space is unknown. To address these shortcomings, we develop a physics-informed dynamical variational autoencoder ($\Phi$-DVAE) to embed diverse data streams into time-evolving physical systems described by differential equations. Our approach combines a standard, possibly nonlinear, filter for the latent state-space model with a VAE to assimilate the unstructured data into the latent dynamical system. In our example systems, the unstructured data takes the form of video data and velocity field measurements; however, the methodology is sufficiently generic to allow for arbitrary unknown observation operators. A variational Bayesian framework is used for the joint estimation of the encoding, latent states, and unknown system parameters. To demonstrate the method, we provide case studies with the Lorenz-63 ordinary differential equation and with the advection and Korteweg-de Vries partial differential equations. Our results, obtained with synthetic data, show that $\Phi$-DVAE provides a data-efficient dynamics encoding methodology that is competitive with standard approaches. Unknown parameters are recovered with uncertainty quantification, and unseen data are accurately predicted.
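To make the notion of a latent dynamical system concrete, the following is a minimal sketch of the Lorenz-63 equations, one of the case studies named above, integrated with a classical fourth-order Runge-Kutta step. It is not the authors' implementation and contains no filter or VAE. The function names (`lorenz63`, `rk4_step`, `simulate`), the step size, and the parameter values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$ are assumptions chosen for illustration; the parameter values are the classical ones, which the abstract does not state.

```python
from typing import List, Tuple

State = Tuple[float, float, float]


def lorenz63(state: State, sigma: float = 10.0, rho: float = 28.0,
             beta: float = 8.0 / 3.0) -> State:
    """Right-hand side of the Lorenz-63 ODE (classical parameters assumed)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state: State, dt: float) -> State:
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shift(s: State, k: State, h: float) -> State:
        # Move the state s along the slope k by step h.
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz63(state)
    k2 = lorenz63(shift(state, k1, dt / 2))
    k3 = lorenz63(shift(state, k2, dt / 2))
    k4 = lorenz63(shift(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state: State, dt: float = 0.01, n_steps: int = 1000) -> List[State]:
    """Return the trajectory, including the initial state."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(state, dt)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    traj = simulate((1.0, 1.0, 1.0))
    print(len(traj), traj[-1])
```

In a setting like the one described, a trajectory of this kind would play the role of the unobserved latent state, with the observed data (for example, video frames) related to it through an unknown observation operator.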