DiffBFR: Bootstrapping Diffusion Model Towards Blind Face Restoration

Blind face restoration (BFR) is important yet challenging. Prior works favor GAN-based frameworks for this task because they balance quality and efficiency. However, these methods suffer from poor training stability and adapt poorly to long-tail distributions, so they fail to retain source identity and restore detail at the same time. We propose DiffBFR, which introduces the Diffusion Probabilistic Model (DPM) to BFR to address these problems, given its advantages over GANs in avoiding training collapse and in generating long-tail distributions. DiffBFR uses a two-step design that first restores identity information from low-quality (LQ) images and then enhances texture details according to the distribution of real faces. This design is implemented with two key components: 1) Identity Restoration Module (IRM), which preserves the face details in the results. Instead of denoising from a pure Gaussian random distribution with LQ images as the condition during the reverse process, we propose a novel truncated sampling method that starts from LQ images with partial noise added. We theoretically prove that this change shrinks the evidence lower bound of the DPM and thereby restores more of the original details. Building on this proof, two cascaded conditional DPMs with different input sizes are introduced to strengthen this sampling effect and to reduce the difficulty of training a model that generates high-resolution images directly. 2) Texture Enhancement Module (TEM), which polishes the texture of the image. Here an unconditional DPM, an LQ-free model, is introduced to further force the restorations to appear realistic. We theoretically prove that this unconditional DPM, trained purely on high-quality (HQ) images, helps correct the pixel-level distribution of the images output by IRM at inference. Truncated sampling with a fractional time step is used to polish pixel-level textures while preserving identity information.
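The truncated sampling idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear noise schedule, the standard DDPM update rule, the starting step `t_start`, and all function names (`make_alpha_bar`, `diffuse_to_step`, `truncated_sample`, `make_oracle_eps_model`) are assumptions made for illustration. In particular, the "oracle" noise predictor is a hypothetical stand-in for a trained conditional network; it simply treats the LQ image as the clean target so that the sketch runs without any learned weights.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption) and its cumulative alpha products."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return betas, np.cumprod(1.0 - betas)


def diffuse_to_step(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise


def truncated_sample(lq_image, eps_model, t_start, betas, alpha_bar, rng):
    """Run the reverse process from t_start (not from the last step) down to 0.

    The chain starts from the LQ image with partial noise added, rather than
    from pure Gaussian noise.
    """
    x = diffuse_to_step(lq_image, t_start, alpha_bar, rng)
    for t in range(t_start, -1, -1):
        eps_hat = eps_model(x, t, lq_image)  # noise prediction conditioned on LQ
        alpha_t = 1.0 - betas[t]
        # Standard DDPM posterior mean expressed through the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_t)
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean  # no noise is added at the final step
    return x


def make_oracle_eps_model(alpha_bar):
    """Hypothetical placeholder for a trained network.

    It assumes the clean image equals the LQ condition and returns the noise
    that would explain x_t under that assumption.
    """
    def eps_model(x_t, t, lq_image):
        return (x_t - np.sqrt(alpha_bar[t]) * lq_image) / np.sqrt(1.0 - alpha_bar[t])
    return eps_model


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    betas, alpha_bar = make_alpha_bar()
    lq = rng.standard_normal((3, 8, 8))  # stand-in for a small LQ image
    restored = truncated_sample(lq, make_oracle_eps_model(alpha_bar), 300, betas, alpha_bar, rng)
    print(restored.shape, float(np.abs(restored - lq).max()))
```

Because the placeholder predictor treats the LQ image as the clean target, the chain simply returns to the LQ image. A trained conditional DPM would instead steer the partially noised input toward a restored face. The point of the sketch is only the control flow: noise is added up to an intermediate step, and denoising runs over the remaining steps alone.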
