Diffusion models are the state-of-the-art generative models for high-resolution images, but sampling from pretrained models is computationally expensive, which motivates interest in fast sampling. Although Free-U Net is a training-free enhancement for improving image quality, we find it ineffective under few-step ($<10$) sampling. We analyze the discrete diffusion ODE and propose F-scheduler, a scheduler designed for ODE solvers used with Free-U Net. The proposed scheduler has two components: a special time schedule that does not fully denoise the feature, which enables the use of the KL term in the $\beta$-VAE decoder, and a schedule that selects the proper inference stage at which Free-U Net modifies the U-Net skip connections. Using information theory, we provide insights into how better-scheduled ODE solvers for diffusion models can outperform training-based diffusion distillation models. The proposed scheduler is compatible with most few-step ODE solvers. When applied to DPM++ 2m and UniPC, it can sample a 1024 × 1024 image in 6 steps and a 512 × 512 image in 5 steps, with FID results that outperform the SOTA distillation models and the 20-step DPM++ 2m solver, respectively. Codebase: https://github.com/TheLovesOfLadyPurple/F-scheduler