While conditional diffusion models are known to have good coverage of the data distribution, they still face limitations in output diversity, particularly when sampled with a high classifier-free guidance scale for optimal image quality or when trained on small datasets. We attribute this problem to the role of the conditioning signal during inference and offer an improved sampling strategy for diffusion models that increases generation diversity, especially at high guidance scales, with minimal loss of sample quality. Our sampling strategy anneals the conditioning signal by adding scheduled, monotonically decreasing Gaussian noise to the conditioning vector during inference to balance diversity and condition alignment. Our Condition-Annealed Diffusion Sampler (CADS) can be used with any pretrained model and sampling algorithm, and we show that it boosts the diversity of diffusion models in various conditional generation tasks. Further, using an existing pretrained diffusion model, CADS achieves new state-of-the-art FIDs of 1.70 and 2.31 for class-conditional ImageNet generation at 256$\times$256 and 512$\times$512, respectively.
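The annealing idea described above can be illustrated with a minimal sketch. The abstract states only that the noise added to the conditioning vector is Gaussian, scheduled, and monotonically decreasing during inference; everything else here is an assumption made for illustration: the piecewise-linear schedule, the square-root mixing of signal and noise, the convention that `t` runs from 1 (start of sampling) to 0 (end), and the names `annealing_coefficient`, `anneal_condition`, `tau1`, `tau2`, and `noise_scale`.

```python
import numpy as np


def annealing_coefficient(t, tau1=0.6, tau2=0.9):
    """Hypothetical schedule: 0 for t >= tau2, 1 for t <= tau1, linear in between.

    t is assumed to run from 1 (start of sampling) down to 0 (end), so the
    coefficient grows and the injected noise shrinks as inference proceeds.
    """
    return float(np.clip((tau2 - t) / (tau2 - tau1), 0.0, 1.0))


def anneal_condition(y, t, noise_scale=0.1, rng=None, tau1=0.6, tau2=0.9):
    """Return a noised copy of the conditioning vector y at time t.

    Assumed mixing rule: sqrt(gamma) * y + noise_scale * sqrt(1 - gamma) * n,
    with n drawn from a standard Gaussian.
    """
    rng = np.random.default_rng() if rng is None else rng
    gamma = annealing_coefficient(t, tau1, tau2)
    noise = rng.standard_normal(y.shape)
    return np.sqrt(gamma) * y + noise_scale * np.sqrt(1.0 - gamma) * noise


# Usage: a placeholder embedding is annealed along a 10-step schedule.
rng = np.random.default_rng(0)
y = np.ones(4)  # stand-in for a class or text embedding
for t in np.linspace(1.0, 0.0, 10):
    y_t = anneal_condition(y, t, rng=rng)
    # y_t would be passed to the pretrained denoiser in place of y at this step.
```

In this sketch the condition is fully replaced by scaled noise early in sampling and is passed through unchanged once `t` falls to `tau1` or below, which is one way to trade diversity against condition alignment.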