Wasserstein distances have greatly influenced, and given rise to, various types of generative neural network models. Wasserstein autoencoders are particularly notable for their mathematical simplicity and straightforward implementation. However, adapting them to the conditional case poses theoretical difficulties. As a remedy, we propose the use of two paired autoencoders. Under the assumption of an optimal autoencoder pair, we leverage the pairwise independence condition of our prescribed Gaussian latent distribution to overcome this theoretical hurdle. We conduct several experiments to showcase the practical applicability of the resulting paired Wasserstein autoencoders. Specifically, we consider imaging tasks and enable conditional sampling for denoising, inpainting, and unsupervised image translation. Moreover, we connect our image translation model to the Monge map behind Wasserstein-2 distances.
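To make the notion of a Monge map behind a Wasserstein-2 distance concrete, the following minimal sketch treats the simplest textbook case of two one-dimensional Gaussians, where both the optimal map and the distance are known in closed form. It is an illustration only and not the paired autoencoder model described above; the function names `monge_map_gaussian_1d` and `w2_gaussian_1d`, the chosen means and standard deviations, and the sample size are all assumptions made for this example.

```python
import math
import random
import statistics


def monge_map_gaussian_1d(x, m1, s1, m2, s2):
    """Monge map for quadratic cost that pushes N(m1, s1^2) onto N(m2, s2^2)."""
    return m2 + (s2 / s1) * (x - m1)


def w2_gaussian_1d(m1, s1, m2, s2):
    """Closed-form Wasserstein-2 distance between two 1D Gaussians."""
    return math.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)


# Hypothetical source N(0, 1) and target N(3, 4), chosen only for illustration.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(20000)]
mapped = [monge_map_gaussian_1d(x, 0.0, 1.0, 3.0, 2.0) for x in source]

# The root mean squared displacement under the Monge map estimates W2.
squared_moves = [(t - x) ** 2 for x, t in zip(source, mapped)]
empirical_w2 = math.sqrt(statistics.fmean(squared_moves))
closed_form_w2 = w2_gaussian_1d(0.0, 1.0, 3.0, 2.0)
```

In this sketch the mapped samples should follow the target distribution approximately, and `empirical_w2` should lie close to `closed_form_w2`, since the Monge map attains the minimal quadratic transport cost. In the imaging setting no such closed form is available, which is why a learned model is needed there.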