Deep generative models are a prominent approach to data generation and have been used to produce high-quality samples in various domains. Diffusion models, an emerging class of deep generative models, have attracted considerable attention owing to their exceptional generative quality. Despite this, they have notable limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space. This survey presents a range of advanced techniques for enhancing diffusion models, including sampling acceleration and the design of new diffusion processes. We also examine strategies for implementing diffusion models in manifold and discrete spaces, maximum likelihood training for diffusion models, and methods for building bridges between two arbitrary distributions. These innovations represent recent efforts to improve the functionality and efficiency of diffusion models. To assess the efficacy of existing models, we present a benchmark of Fréchet Inception Distance (FID), Inception Score (IS), and negative log-likelihood (NLL) at specific numbers of function evaluations (NFE). Diffusion models have also proven useful in domains such as computer vision, audio, sequence modeling, and AI for science. The paper concludes with a summary of the field, along with existing limitations and future directions. A categorized summary of existing methods is available in our GitHub repository: https://github.com/chq1155/A-Survey-on-Generative-Diffusion-Model