The Wasserstein autoencoder (WAE) shows that matching two distributions is equivalent to minimizing a simple autoencoder (AE) loss under the constraint that the latent space of the AE matches a pre-specified prior distribution. This latent-space distribution matching is a core component of WAE and a challenging task. In this paper, we propose to address it with the contrastive learning framework, which has been shown to be effective for self-supervised representation learning. We do so by exploiting the fact that contrastive learning objectives optimize the latent-space distribution to be uniform over the unit hypersphere, a distribution that is easy to sample from. We show that using the contrastive learning framework to optimize the WAE loss achieves faster convergence and more stable optimization than existing popular algorithms for WAE. This is also reflected in the FID scores on the CelebA and CIFAR-10 datasets and in the realistic quality of images generated on the CelebA-HQ dataset.
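The claim that the uniform distribution on the unit hypersphere is easy to sample from can be illustrated with a short NumPy sketch. It is not the paper's implementation: the function names `sample_unit_hypersphere` and `uniformity_loss`, the temperature `t`, and the toy dimensions are assumptions chosen for illustration, and the uniformity measure used here is the Gaussian-potential diagnostic common in the contrastive learning literature rather than the objective used in this work.

```python
import numpy as np


def sample_unit_hypersphere(n, dim, rng):
    """Draw n points uniformly from the unit hypersphere in R^dim.

    Normalizing isotropic Gaussian samples gives a uniform distribution
    on the sphere, because the Gaussian is rotation invariant.
    """
    g = rng.standard_normal((n, dim))
    return g / np.linalg.norm(g, axis=1, keepdims=True)


def uniformity_loss(z, t=2.0):
    """Log of the mean Gaussian potential over distinct pairs of rows of z.

    Lower values indicate codes spread more evenly over the sphere.
    (Hypothetical helper; t is an assumed temperature.)
    """
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(len(z), k=1)  # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq_dists[upper]))))


rng = np.random.default_rng(0)

# Samples from the prior: uniform on the unit sphere in 16 dimensions.
prior = sample_unit_hypersphere(256, 16, rng)

# A deliberately poor latent distribution: codes clustered around one point.
center = sample_unit_hypersphere(1, 16, rng)
collapsed = center + 0.05 * rng.standard_normal((256, 16))
collapsed /= np.linalg.norm(collapsed, axis=1, keepdims=True)

print("uniform prior:  ", uniformity_loss(prior))
print("collapsed codes:", uniformity_loss(collapsed))
```

In a generative setting, decoding samples such as `prior` would produce new data once the encoder's latent codes are close to uniform on the sphere; the clustered codes show the failure case, which scores a noticeably higher value under this measure.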