Diffusion models benefit from incorporating task-specific information into the score function to steer sample generation towards desired properties. Such information is referred to as guidance. For example, in text-to-image synthesis, the text input is encoded as guidance to generate semantically aligned images. The performance of diffusion models depends closely on proper guidance inputs. A common observation is that strong guidance promotes tight alignment with the task-specific information while reducing the diversity of the generated samples. In this paper, we provide the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models. Under mild conditions, we prove that incorporating diffusion guidance not only boosts classification confidence but also diminishes distribution diversity, leading to a reduction in the differential entropy of the output distribution. Our analysis covers widely adopted sampling schemes, including DDPM and DDIM, and leverages comparison inequalities for differential equations as well as the Fokker-Planck equation that characterizes the evolution of the probability density function, which may be of independent theoretical interest.
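To make the setting concrete, the following is a minimal sketch of a guided score for a one-dimensional Gaussian mixture. It is an illustration under stated assumptions, not the paper's construction: the components are assumed to have unit variance and a uniform class prior, the forward process is assumed to be variance preserving (so each noised component stays at unit variance with mean `alpha * mu_k`), and guidance is assumed to take the classifier-guidance form, conditional score plus `eta` times the gradient of the log class posterior. The names `posterior`, `guided_score`, `alpha`, and `eta` are hypothetical.

```python
import math


def posterior(x, means, alpha):
    """Class posterior p_t(y = k | x) for a uniform-prior mixture.

    Assumption: the noised component k is N(alpha * means[k], 1).
    """
    logits = [-0.5 * (x - alpha * m) ** 2 for m in means]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def guided_score(x, y, means, alpha, eta):
    """Conditional score plus eta times the classifier gradient.

    Assumed form: grad log p_t(x | y) + eta * grad log p_t(y | x).
    Setting eta = 0 recovers the plain conditional score.
    """
    # Score of the noised class-y component N(alpha * mu_y, 1).
    conditional = -(x - alpha * means[y])
    # grad log p_t(y | x) = grad log p_t(x | y) - grad log p_t(x),
    # which simplifies to alpha * (mu_y - posterior mean of the class means).
    post = posterior(x, means, alpha)
    post_mean = sum(p * m for p, m in zip(post, means))
    classifier_grad = alpha * (means[y] - post_mean)
    return conditional + eta * classifier_grad


if __name__ == "__main__":
    means = [-1.0, 1.0]
    for eta in (0.0, 2.0):
        print(eta, guided_score(0.0, 1, means, alpha=1.0, eta=eta))
```

In this sketch a larger `eta` strengthens the pull towards the target class mean relative to the other components, which is the mechanism behind the trade-off between classification confidence and sample diversity described above.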