In this paper, we uncover the untapped potential of the diffusion U-Net, which serves as a "free lunch" that substantially improves generation quality on the fly. We first investigate the key contributions of the U-Net architecture to the denoising process and find that its main backbone primarily contributes to denoising, whereas its skip connections mainly introduce high-frequency features into the decoder module, causing the network to overlook the backbone semantics. Capitalizing on this discovery, we propose a simple yet effective method, termed "FreeU", that enhances generation quality without additional training or finetuning. Our key insight is to strategically re-weight the contributions of the U-Net's skip connections and backbone feature maps, so as to leverage the strengths of both components of the architecture. Promising results on image and video generation tasks demonstrate that FreeU can be readily integrated into existing diffusion models, e.g., Stable Diffusion, DreamBooth, ModelScope, Rerender and ReVersion, to improve generation quality with only a few lines of code. All you need to do is adjust two scaling factors during inference. Project page: https://chenyangsi.top/FreeU/.
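To make the re-weighting idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a decoder stage joins backbone features and skip features by channel-wise concatenation, and that the two scaling factors are plain scalars applied uniformly. The names `reweight_and_concat`, `b`, and `s` are hypothetical, and nested Python lists stand in for tensors so the example has no dependencies.

```python
from typing import List

# A feature map is modeled as channels x flattened spatial positions.
FeatureMap = List[List[float]]


def reweight_and_concat(
    backbone: FeatureMap,
    skip: FeatureMap,
    b: float = 1.0,  # hypothetical backbone scaling factor
    s: float = 1.0,  # hypothetical skip-connection scaling factor
) -> FeatureMap:
    """Scale backbone and skip features, then concatenate along channels.

    With b == s == 1.0 this reduces to plain concatenation, i.e. the
    unmodified decoder input under the assumption stated above.
    """
    if backbone and skip and len(backbone[0]) != len(skip[0]):
        raise ValueError("backbone and skip must share the same spatial size")

    scaled_backbone = [[b * v for v in channel] for channel in backbone]
    scaled_skip = [[s * v for v in channel] for channel in skip]

    # List concatenation plays the role of a channel-wise tensor concat.
    return scaled_backbone + scaled_skip


# One backbone channel and one skip channel, two spatial positions each.
decoder_input = reweight_and_concat(
    backbone=[[1.0, 2.0]],
    skip=[[4.0, 8.0]],
    b=2.0,  # amplify backbone features
    s=0.5,  # attenuate skip features
)
print(decoder_input)
```

Because the factors are applied only at inference, a sketch like this changes no model weights; setting both factors to 1.0 leaves the decoder input untouched.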