Variational autoencoders (VAEs) are powerful generative models, but their generated samples and reconstructions are often blurry compared to the images they were trained on. Significant research effort has gone into improving their generative capabilities by building more flexible models, but this flexibility often comes at the cost of higher model complexity and computational cost. Several works have instead altered the reconstruction term of the evidence lower bound (ELBO), but often at the expense of losing the mathematical link to maximizing the likelihood of the samples under the modeled distribution. Here we propose a new formulation of the reconstruction term for the VAE that specifically penalizes the generation of blurry images while still maximizing the ELBO under the modeled distribution. We show the potential of the proposed loss on three different data sets, where it outperforms several recently proposed reconstruction losses for VAEs.
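For orientation, the ELBO referred to above can be written in its standard form. The notation is an assumption and is not taken from the abstract: $x$ is a data sample, $z$ the latent variable, $q_\phi(z \mid x)$ the encoder, $p_\theta(x \mid z)$ the decoder likelihood, and $p(z)$ the prior.

\[
\log p_\theta(x) \;\geq\; \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]}_{\text{reconstruction term}} \;-\; \underbrace{D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)}_{\text{regularization term}}
\]

The first term is the reconstruction term that is being reformulated. As one common example, if $p_\theta(x \mid z)$ is a Gaussian with fixed isotropic covariance, maximizing this term is equivalent to minimizing a scaled squared error between $x$ and the decoder output. The formulation proposed here is not specified in the abstract and is not reproduced in this sketch.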