In this paper, we uncover the untapped potential of the diffusion U-Net, which serves as a "free lunch" that substantially improves generation quality on the fly. We first investigate the key contributions of the U-Net architecture to the denoising process and find that its main backbone primarily contributes to denoising, whereas its skip connections mainly introduce high-frequency features into the decoder module, causing the network to overlook the backbone semantics. Capitalizing on this discovery, we propose a simple yet effective method, termed "FreeU", that enhances generation quality without additional training or finetuning. Our key insight is to strategically re-weight the contributions sourced from the U-Net's skip connections and backbone feature maps, so as to leverage the strengths of both components of the U-Net architecture. Promising results on image and video generation tasks demonstrate that FreeU can be readily integrated into existing diffusion models, e.g., Stable Diffusion, DreamBooth, ModelScope, Rerender and ReVersion, to improve generation quality with only a few lines of code. All you need is to adjust two scaling factors during inference. Project page: https://chenyangsi.top/FreeU/.
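To make the re-weighting idea concrete, the following is a minimal sketch of scaling the two inputs of a U-Net decoder block before they are merged. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say where or how the two factors are applied, so the sketch simply multiplies every backbone channel by a factor `b` and every skip channel by a factor `s`, then concatenates them along the channel axis. The names `scale`, `reweight_and_concat`, `b`, and `s`, the list-based feature layout, and the example values are all hypothetical.

```python
from typing import List

# Assumed layout: a feature map is a list of channels, and each channel
# is a flattened list of activations. Real models use 4-D tensors.
FeatureMap = List[List[float]]


def scale(features: FeatureMap, factor: float) -> FeatureMap:
    """Multiply every activation in every channel by `factor`."""
    return [[factor * v for v in channel] for channel in features]


def reweight_and_concat(backbone: FeatureMap, skip: FeatureMap,
                        b: float = 1.0, s: float = 1.0) -> FeatureMap:
    """Scale the backbone and skip features, then concatenate channels.

    With b == s == 1.0 this reduces to a plain channel concatenation.
    Uniform scaling is an assumption of this sketch only.
    """
    return scale(backbone, b) + scale(skip, s)


# Toy inputs: two backbone channels and one skip channel.
backbone = [[1.0, 2.0], [3.0, 4.0]]
skip = [[10.0, 20.0]]

# Arbitrary illustrative factors: amplify backbone, attenuate skip.
merged = reweight_and_concat(backbone, skip, b=1.5, s=0.5)
print(merged)  # [[1.5, 3.0], [4.5, 6.0], [5.0, 10.0]]
```

Because the operation only rescales activations at inference time, no weights change and no retraining is involved, which is consistent with the "free lunch" framing above.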