While conditional diffusion models are known to have good coverage of the data distribution, they still face limitations in output diversity, particularly when sampled with a high classifier-free guidance scale for optimal image quality or when trained on small datasets. We attribute this problem to the role of the conditioning signal in inference and offer an improved sampling strategy for diffusion models that can increase generation diversity, especially at high guidance scales, with minimal loss of sample quality. Our sampling strategy anneals the conditioning signal by adding scheduled, monotonically decreasing Gaussian noise to the conditioning vector during inference to balance diversity and condition alignment. Our Condition-Annealed Diffusion Sampler (CADS) can be used with any pretrained model and sampling algorithm, and we show that it boosts the diversity of diffusion models in various conditional generation tasks. Further, using an existing pretrained diffusion model, CADS achieves a new state-of-the-art FID of 1.70 and 2.31 for class-conditional ImageNet generation at 256$\times$256 and 512$\times$512 respectively.
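To make the annealing idea concrete, the following is a minimal sketch of adding scheduled, monotonically decreasing Gaussian noise to a conditioning vector over the course of sampling. The abstract does not specify the schedule or any rescaling, so the linear decay, the function names `noise_scale` and `anneal_condition`, and the `initial_scale` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def noise_scale(step, num_steps, initial_scale=1.0):
    """Hypothetical linear schedule: initial_scale at step 0, zero at the last step."""
    if num_steps < 2:
        return 0.0
    return initial_scale * (1.0 - step / (num_steps - 1))


def anneal_condition(cond, step, num_steps, rng, initial_scale=1.0):
    """Return the conditioning vector corrupted by Gaussian noise for this step.

    Early steps receive strong noise (weaker conditioning, more diversity);
    late steps receive little or none (stronger condition alignment).
    """
    sigma = noise_scale(step, num_steps, initial_scale)
    return cond + sigma * rng.standard_normal(cond.shape)


# Illustrative usage with a toy 4-dimensional conditioning vector.
rng = np.random.default_rng(0)
cond = np.ones(4)
num_steps = 5
scales = [noise_scale(i, num_steps) for i in range(num_steps)]
noisy_conditions = [
    anneal_condition(cond, i, num_steps, rng) for i in range(num_steps)
]
```

In an actual sampler, the noisy vector would be passed to the pretrained denoiser in place of the clean condition at each step; because only the conditioning input changes, a scheme like this would not require retraining the model.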