Conditional diffusion models are powerful generative models that can leverage various types of conditional information, such as class labels, segmentation masks, or text captions. However, in many real-world scenarios, conditional information may be noisy or unreliable due to human annotation errors or weak alignment. In this paper, we propose Coherence-Aware Diffusion (CAD), a method that incorporates the coherence of conditional information into diffusion models, allowing them to learn from noisy annotations without discarding data. We assume that each data point has an associated coherence score reflecting the quality of its conditional information. We then condition the diffusion model on both the conditional information and the coherence score. In this way, the model learns to ignore or discount the conditioning when coherence is low. We show that CAD is theoretically sound and empirically effective on various conditional generation tasks. Moreover, we show that leveraging coherence yields realistic and diverse samples that respect the conditional information better than models trained on cleaned datasets from which low-coherence samples have been discarded.
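The conditioning scheme described above can be sketched minimally: the coherence score is simply passed to the denoiser as an additional conditioning input alongside the usual condition embedding, so the network is free to learn how much weight to give the condition. The toy linear denoiser and all names below (`denoiser`, `cond_emb`, `coherence`) are hypothetical illustrations, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoiser(x_t, t, cond_emb, coherence, W):
    # Toy stand-in for a noise-prediction network: all inputs are
    # concatenated and passed through one linear map. The coherence
    # scalar is an extra conditioning channel, so a trained network
    # could learn to discount cond_emb when coherence is low.
    feats = np.concatenate([x_t, [t], cond_emb, [coherence]])
    return W @ feats  # predicted noise, same shape as x_t

d_x, d_c = 4, 3                                   # data / condition dims
W = rng.normal(size=(d_x, d_x + 1 + d_c + 1))     # untrained weights
x_t = rng.normal(size=d_x)                        # noisy sample at step t
cond = rng.normal(size=d_c)                       # condition embedding

eps_hi = denoiser(x_t, 0.5, cond, coherence=1.0, W=W)  # trusted annotation
eps_lo = denoiser(x_t, 0.5, cond, coherence=0.0, W=W)  # unreliable annotation
```

At sampling time, setting the coherence input to its maximum value asks the model for outputs that fully respect the condition, while training-time coherence values let noisy annotations contribute without being trusted blindly.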