Incorporating unstructured data into physical models is a challenging problem that is emerging in data assimilation. Traditional approaches focus on well-defined observation operators whose functional forms are typically assumed to be known. This assumption prevents these methods from achieving a consistent model-data synthesis when the mapping from data space to model space is unknown. To address these shortcomings, we develop a physics-informed dynamical variational autoencoder ($\Phi$-DVAE) to embed diverse data streams into time-evolving physical systems described by differential equations. Our approach combines a standard, possibly nonlinear, filter for the latent state-space model with a VAE to assimilate the unstructured data into the latent dynamical system. In our example systems, the unstructured data take the form of video data and velocity field measurements; however, the methodology is generic enough to allow for arbitrary unknown observation operators. A variational Bayesian framework is used for the joint estimation of the encoding, latent states, and unknown system parameters. To demonstrate the method, we provide case studies with the Lorenz-63 ordinary differential equation, and the advection and Korteweg-de Vries partial differential equations. Our results on synthetic data show that $\Phi$-DVAE provides a data-efficient dynamics encoding methodology that is competitive with standard approaches. Unknown parameters are recovered with uncertainty quantification, and unseen data are accurately predicted.