Generative models have emerged as a promising technique for producing high-quality images that are difficult to distinguish from real images. Generative adversarial networks (GANs) and variational autoencoders (VAEs) are two of the most prominent and widely studied generative models. GANs generate sharp, realistic images, while VAEs generate diverse images. However, GANs are prone to mode collapse: they ignore a large portion of the possible output space and therefore fail to represent the full diversity of the target distribution. VAEs, in turn, tend to produce blurry images. To capitalize on the strengths of both models while mitigating their weaknesses, we employ a Bayesian non-parametric (BNP) approach to merge GANs and VAEs. Our procedure incorporates both the Wasserstein distance and the maximum mean discrepancy (MMD) in the loss function to enable effective learning of the latent space and to generate diverse, high-quality samples. By fusing the discriminative power of GANs with the reconstruction capabilities of VAEs, our model achieves superior performance in various generative tasks, such as anomaly detection and data augmentation. Furthermore, we enhance the model's capability by employing an additional generator in the code space, which lets us explore regions of the code space that the VAE might have overlooked. The BNP perspective allows us to model the data distribution in an infinite-dimensional space, which gives the model greater flexibility and reduces the risk of overfitting. This framework improves on both GANs and VAEs, yielding a more robust generative model suitable for various applications.
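To make the MMD term mentioned in the abstract concrete, the following is a minimal sketch of a sample-based MMD estimate. It is an illustration only, not the authors' implementation: the abstract does not specify the kernel, the estimator, or how the term is weighted against the Wasserstein term, so the Gaussian kernel, the biased (V-statistic) estimator, the bandwidth of 1.0, and the function names `gaussian_kernel`, `mmd_squared`, and `sample_gaussian` are all assumptions.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)). Kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        # Average kernel value over all pairs (p, q) with p in a and q in b.
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_gaussian(n, dim, mean, rng):
    """Hypothetical helper: n points from an isotropic unit-variance Gaussian."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
real = sample_gaussian(100, 2, 0.0, rng)   # stand-in for real data
close = sample_gaussian(100, 2, 0.0, rng)  # stand-in for a well-matched generator
far = sample_gaussian(100, 2, 3.0, rng)    # stand-in for a poorly matched generator

mmd_close = mmd_squared(real, close)
mmd_far = mmd_squared(real, far)
print(f"MMD^2 (matched): {mmd_close:.4f}")
print(f"MMD^2 (shifted): {mmd_far:.4f}")
```

The estimate is small when the two samples come from the same distribution and grows as they diverge, which is what makes it usable as a training signal for matching generated samples to data.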