Hierarchical Variational Autoencoders (VAEs) are among the most popular likelihood-based generative models. It is widely believed that top-down hierarchical VAEs can effectively learn deep latent structures and avoid problems such as posterior collapse. Here, we show that this is not necessarily the case and that the problem of collapsing posteriors remains. To discourage posterior collapse, we propose a new deep hierarchical VAE with a partly fixed encoder; specifically, we use the Discrete Cosine Transform to obtain the top latent variables. In a series of experiments, we observe that the proposed modification achieves better utilization of the latent space. Further, we demonstrate that the proposed approach can be useful for compression and for robustness to adversarial attacks.
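To make the idea of a partly fixed encoder concrete, the following is a minimal sketch of how a Discrete Cosine Transform could produce a deterministic top-level latent from a square image. It is an illustration under stated assumptions, not the model described above: the function names (`top_latent`, `decode_context`), the choice to keep a low-frequency `keep x keep` block, and the orthonormal DCT-II normalization are all hypothetical, and any quantization or learned components are omitted.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix C of size n x n, so that C @ C^T is the identity."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def dct2(image):
    """2D DCT-II of a square image X, computed as C @ X @ C^T."""
    c = dct_matrix(len(image))
    return matmul(matmul(c, image), transpose(c))


def idct2(coeffs):
    """Inverse of dct2, computed as C^T @ Y @ C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def top_latent(image, keep):
    """Assumed design: the top latent is the keep x keep low-frequency DCT block.

    This step has no trainable parameters, so this part of the encoder is fixed.
    """
    coeffs = dct2(image)
    return [row[:keep] for row in coeffs[:keep]]


def decode_context(latent, size):
    """Zero-pad the latent back to size x size and invert the DCT.

    The result is a low-pass approximation of the input image.
    """
    keep = len(latent)
    padded = [
        [latent[u][v] if u < keep and v < keep else 0.0 for v in range(size)]
        for u in range(size)
    ]
    return idct2(padded)
```

Because the transform has no trainable parameters, the top latent in this sketch always carries the low-frequency content of the input, which is one intuition for why a fixed transform might keep the top of the hierarchy from being ignored. Calling `decode_context(top_latent(image, keep), len(image))` returns a smoothed version of `image` whose detail grows with `keep`.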