While conditional diffusion models are known to have good coverage of the data distribution, they still face limitations in output diversity, particularly when sampled with a high classifier-free guidance scale for optimal image quality or when trained on small datasets. We attribute this problem to the role of the conditioning signal during inference and offer an improved sampling strategy for diffusion models that increases generation diversity, especially at high guidance scales, with minimal loss of sample quality. Our sampling strategy anneals the conditioning signal by adding scheduled, monotonically decreasing Gaussian noise to the conditioning vector during inference to balance diversity and condition alignment. Our Condition-Annealed Diffusion Sampler (CADS) can be used with any pretrained model and sampling algorithm, and we show that it boosts the diversity of diffusion models in various conditional generation tasks. Further, using an existing pretrained diffusion model, CADS achieves new state-of-the-art FIDs of 1.70 and 2.31 for class-conditional ImageNet generation at 256$\times$256 and 512$\times$512, respectively.
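To make the idea of annealing the conditioning signal concrete, the following is a minimal sketch of one way such a scheme could look. The abstract states only that the noise is Gaussian, scheduled, and monotonically decreasing during inference. The piecewise-linear schedule, the thresholds `tau1` and `tau2`, the `noise_scale` parameter, the square-root mixing rule, and the function names are all illustrative assumptions and are not taken from the text above.

```python
import math
import random


def anneal_weight(t: float, tau1: float = 0.6, tau2: float = 0.9) -> float:
    """Hypothetical schedule: weight placed on the clean condition at time t.

    t runs from 1.0 (start of sampling) down to 0.0 (final sample).
    Returns 0.0 for t >= tau2, 1.0 for t <= tau1, and is linear in between.
    """
    if t <= tau1:
        return 1.0
    if t >= tau2:
        return 0.0
    return (tau2 - t) / (tau2 - tau1)


def anneal_condition(y, t, noise_scale=0.1, rng=random):
    """Return a noisy copy of the conditioning vector y for time t.

    The noise standard deviation is noise_scale * sqrt(1 - weight), so it
    shrinks to zero as sampling proceeds (assumed mixing rule).
    """
    gamma = anneal_weight(t)
    signal = math.sqrt(gamma)
    noise_std = noise_scale * math.sqrt(1.0 - gamma)
    return [signal * v + noise_std * rng.gauss(0.0, 1.0) for v in y]
```

In a sampling loop, one would call `anneal_condition(y, t)` at each step and pass the result to the pretrained model in place of `y`. Early steps then see a heavily perturbed condition, which encourages diversity, while late steps see the clean condition, which preserves alignment.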