Posterior collapse in variational autoencoders (VAEs), where the variational posterior distribution closely matches the prior distribution, can degrade the quality of the learned latent variables. When it occurs, the latent variables extracted by the encoder preserve less information from the input data and thus fail to provide meaningful representations as input to the reconstruction process in the decoder. While this phenomenon has been actively studied because of its effect on VAE performance, the theory of posterior collapse remains underdeveloped, especially beyond standard VAEs. In this work, we extend the theoretical understanding of posterior collapse to two important and prevalent yet less studied classes of VAEs: conditional VAEs and hierarchical VAEs. Specifically, via a non-trivial theoretical analysis of linear conditional VAEs and hierarchical VAEs with two levels of latent variables, we prove that the causes of posterior collapse in these models include the correlation between the input and output of the conditional VAEs and the effect of learnable encoder variance in the hierarchical VAEs. We empirically validate our theoretical findings for linear conditional and hierarchical VAEs and demonstrate that these results are also predictive for non-linear cases.