There is a bias in the inference pipeline of most diffusion models. This bias arises from a signal leak whose distribution deviates from the noise distribution, creating a discrepancy between training and inference processes. We demonstrate that this signal-leak bias is particularly significant when models are tuned to a specific style, causing sub-optimal style matching. Recent research tries to avoid the signal leakage during training. We instead show how we can exploit this signal-leak bias in existing diffusion models to allow more control over the generated images. This enables us to generate images with more varied brightness, and images that better match a desired style or color. By modeling the distribution of the signal leak in the spatial frequency and pixel domains, and including a signal leak in the initial latent, we generate images that better match expected results without any additional training.
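As a rough illustration of "including a signal leak in the initial latent", the sketch below fits an independent Gaussian to each element of a set of reference latents (a simple pixel-domain model) and mixes a sample from it into the starting noise. The function names, the mixing formula based on the final cumulative signal coefficient `alpha_bar_T`, and the numeric value used for it are assumptions for illustration only; the abstract does not specify them, and the spatial-frequency model is omitted here.

```python
import numpy as np


def fit_pixel_leak(reference_latents):
    """Fit a per-element Gaussian to reference latents (hypothetical helper).

    reference_latents: array of shape (N, C, H, W), e.g. encoded images
    in the desired style. Returns the per-element mean and std.
    """
    mean = reference_latents.mean(axis=0)
    std = reference_latents.std(axis=0)
    return mean, std


def initial_latent_with_leak(mean, std, alpha_bar_T, rng):
    """Build a starting latent that contains a signal leak (assumed form).

    Instead of starting from pure noise, mix a sample of the modeled
    signal-leak distribution into the noise, using the same weights the
    forward process would use at the last timestep.
    """
    leak = mean + std * rng.standard_normal(mean.shape)
    noise = rng.standard_normal(mean.shape)
    return np.sqrt(alpha_bar_T) * leak + np.sqrt(1.0 - alpha_bar_T) * noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for latents of style images; real use would encode images.
    refs = 0.5 + 0.1 * rng.standard_normal((16, 4, 8, 8))
    mean, std = fit_pixel_leak(refs)
    # alpha_bar_T is a small illustrative value, not taken from the source.
    x_T = initial_latent_with_leak(mean, std, alpha_bar_T=0.005, rng=rng)
    print(x_T.shape)
```

The resulting array would replace the usual pure-noise initial latent passed to a sampler; no model weights change, which is in line with the abstract's claim that no additional training is needed.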