An authentic face restoration system is increasingly in demand in many computer vision applications, e.g., image enhancement, video communication, and portrait photography. Most advanced face restoration models can recover high-quality faces from low-quality ones, but they usually fail to faithfully generate the realistic, high-frequency details that users favor. To achieve authentic restoration, we propose $\textbf{IDM}$, an $\textbf{I}$teratively learned face restoration system based on denoising $\textbf{D}$iffusion $\textbf{M}$odels (DDMs). We define the criterion of an authentic face restoration system and argue that denoising diffusion models are naturally endowed with this property from two aspects: intrinsic iterative refinement and extrinsic iterative enhancement. Intrinsic learning preserves the content well and gradually refines high-quality details, while extrinsic enhancement helps clean the data and improves the restoration task one step further. We demonstrate superior performance on blind face restoration tasks. Beyond restoration, we find that the data authentically cleaned by the proposed restoration system also benefits image generation tasks in terms of training stabilization and sample quality. Without modifying the models, we achieve better quality than the state of the art on FFHQ and ImageNet generation using either GANs or diffusion models.