State inference and parameter learning in sequential models can be performed with approximation techniques that maximize the evidence lower bound (ELBO) on the marginal log-likelihood of the data. These methods are often referred to as dynamical variational autoencoders, and our specific focus is the deep Kalman filter (DKF). It has been shown that the ELBO objective can oversimplify data representations, potentially compromising estimation quality. Tighter Monte Carlo objectives (MCOs) have been proposed in the literature to enhance generative modeling performance. For instance, the importance weighted autoencoder (IWAE) objective uses importance weights to reduce the variance of marginal log-likelihood estimates. In this paper, importance sampling is applied to the DKF framework for learning deep Markov models, resulting in the IW-DKF, which improves both the log-likelihood estimates and the KL divergence between the variational distribution and the transition model. The framework, using the sampled DKF update rule, is also adapted to sequential state and parameter estimation with highly nonlinear physics-based models. An experiment with the three-dimensional Lorenz attractor shows improved generative modeling performance and a lower root-mean-square error (RMSE) when estimating the model parameters and latent states, indicating that tighter MCOs lead to improved state inference performance.
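To make the notion of a "tighter" Monte Carlo objective concrete, the following minimal sketch compares the ELBO (one sample per estimate) with an importance-weighted bound (K samples per estimate) on a static toy model. It is not the IW-DKF and has no sequential structure. The model z ~ N(0, 1), x | z ~ N(z, 1), the deliberately mismatched proposal q(z) = N(0.3, 1), the observation x = 1.5, and the helper names `log_normal` and `iw_bound` are all assumptions chosen for illustration, because this model has a closed-form marginal likelihood to compare against.

```python
import numpy as np


def log_normal(x, mean, std):
    """Log-density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def iw_bound(x, k, n_rep, q_mean, q_std, rng):
    """Monte Carlo estimate of the K-sample importance-weighted bound.

    Assumed toy model (illustrative only): z ~ N(0, 1), x | z ~ N(z, 1),
    with proposal q(z) = N(q_mean, q_std^2). Setting k = 1 gives the ELBO.
    """
    # Draw n_rep independent groups of k proposal samples.
    z = q_mean + q_std * rng.standard_normal((n_rep, k))
    # Log importance weights: log p(x, z) - log q(z).
    log_w = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0) - log_normal(z, q_mean, q_std)
    # Numerically stable log of the mean weight within each group.
    m = log_w.max(axis=1, keepdims=True)
    log_mean_w = m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))
    # Average over groups to approximate the expectation defining the bound.
    return log_mean_w.mean()


rng = np.random.default_rng(0)
x_obs = 1.5

# Exact marginal log-likelihood: x ~ N(0, 2) under the assumed toy model.
true_ll = log_normal(x_obs, 0.0, np.sqrt(2.0))

elbo = iw_bound(x_obs, k=1, n_rep=20000, q_mean=0.3, q_std=1.0, rng=rng)
iwae = iw_bound(x_obs, k=50, n_rep=20000, q_mean=0.3, q_std=1.0, rng=rng)

print(f"log p(x)      = {true_ll:.4f}")
print(f"ELBO (K=1)    = {elbo:.4f}")
print(f"IW bound K=50 = {iwae:.4f}")
```

In this toy setting the gap between the ELBO and log p(x) equals the KL divergence from q to the exact posterior N(0.75, 0.5), roughly 0.36 nats, while the 50-sample bound should sit much closer to log p(x). A sequential method would apply the same weighting idea across time steps, which this sketch does not attempt.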