This paper introduces a new latent variable generative model that handles high-dimensional longitudinal data and relies on variational inference. The time dependency between the observations of an input sequence is modelled using normalizing flows over the associated latent variables. The proposed method can generate either fully synthetic longitudinal sequences or trajectories conditioned on several observations in a sequence, and it is robust to missing data. We test the model on six datasets of varying complexity and show that it achieves better likelihood estimates than some competing models, as well as more reliable missing data imputation. Code is available at \url{https://github.com/clementchadebec/variational_inference_for_longitudinal_data}.