Generative Models (GMs) have attracted considerable attention due to their tremendous success in various domains, such as computer vision, where they are capable of generating impressively realistic-looking images. Likelihood-based GMs are attractive because they can generate new data with a single model evaluation. However, they typically achieve lower sample quality than state-of-the-art score-based diffusion models (DMs). This paper takes a significant step toward addressing this limitation. The idea is to borrow one of the strengths of score-based DMs, namely the ability to perform accurate density estimation in low-density regions and to address manifold overfitting by means of data mollification. We connect data mollification through the addition of Gaussian noise to Gaussian homotopy, a well-known technique for improving optimization. Data mollification can be implemented by adding one line of code to the optimization loop, and we demonstrate that this boosts the generation quality of likelihood-based GMs without computational overhead. We report results on image data sets with popular likelihood-based GMs, including variants of variational autoencoders and normalizing flows, showing large improvements in FID score.
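To make the "one line of code" claim concrete, the following is a minimal sketch rather than the implementation used in the paper. It assumes a toy one-dimensional Gaussian model fitted by stochastic gradient descent on the negative log-likelihood, standing in for a variational autoencoder or normalizing flow. The names `noise_scale`, `mollify`, and `train_gaussian`, as well as the linear annealing schedule that reaches zero noise halfway through training, are illustrative assumptions and do not come from the abstract.

```python
import numpy as np


def noise_scale(step, total_steps, sigma_max=1.0):
    """Hypothetical linear schedule: sigma_max at step 0, zero from the halfway point on."""
    return sigma_max * max(0.0, 1.0 - 2.0 * step / total_steps)


def mollify(x, sigma, rng):
    """Data mollification: add isotropic Gaussian noise with standard deviation sigma."""
    return x + sigma * rng.standard_normal(x.shape)


def train_gaussian(data, total_steps=2000, batch_size=64, lr=0.05, seed=0):
    """Fit N(mu, std^2) to 1-D data by SGD on the mean negative log-likelihood."""
    rng = np.random.default_rng(seed)
    mu, log_std = 0.0, 0.0
    for step in range(total_steps):
        batch = rng.choice(data, size=batch_size)
        # The single added line: train on mollified data, with noise annealed to zero.
        batch = mollify(batch, noise_scale(step, total_steps), rng)
        std = np.exp(log_std)
        z = (batch - mu) / std
        # Per-sample NLL is 0.5 * z**2 + log_std + const; these are its mean gradients.
        grad_mu = -np.mean(z) / std
        grad_log_std = 1.0 - np.mean(z ** 2)
        mu -= lr * grad_mu
        log_std -= lr * grad_log_std
    return mu, np.exp(log_std)
```

Because the noise level decays to zero, the final optimization steps use the original data, so the model ultimately targets the unperturbed data distribution. The only per-step extra cost in this sketch is drawing one noise tensor with the shape of the batch.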