Image restoration has made remarkable progress with the advent of deep learning. Previous methods usually rely on designing powerful network architectures to improve performance; however, the restored results often look unnatural because of color and texture distortions. Beyond visual perceptual quality, semantic perception recovery is an important but often overlooked aspect of restored images, and it is crucial for deployment in high-level tasks. In this paper, we propose a new perspective to resolve these issues by introducing a naturalness-oriented and semantic-aware optimization mechanism, dubbed DiffLoss. Specifically, inspired by the powerful distribution coverage capability of diffusion models for natural image generation, we exploit the Markov chain sampling property of the diffusion model and project the restored results of existing networks into the sampling space. In addition, we reveal that the bottleneck feature of diffusion models, also known as the h-space feature, is a natural high-level semantic space. We delve into this property and propose a semantic-aware loss to further unlock its potential for semantic perception recovery, which paves the way to connecting image restoration with downstream high-level recognition tasks. With these two strategies, DiffLoss can endow existing restoration methods with results that are both more natural and more semantic-aware. We verify the effectiveness of our method on a wide range of common image restoration tasks and benchmarks. Code will be available at https://github.com/JosephTiTan/DiffLoss.
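To make the two ideas in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a standard DDPM forward noising step with a linear beta schedule, and it replaces the frozen pretrained diffusion network with a hypothetical `toy_denoiser` that returns a noise prediction together with a bottleneck ("h-space") feature. The names `alpha_bar`, `q_sample`, `diffusion_guided_loss`, and the weight `sem_weight` are illustrative assumptions; the abstract does not specify the exact loss formulation.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_s) over the first t steps (linear schedule, assumed)."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def q_sample(x0, t, noise):
    """Standard DDPM forward step: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    a = alpha_bar(t)
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]


def toy_denoiser(x_t, t):
    """Hypothetical stand-in for a frozen diffusion U-Net (ignores t for simplicity).

    Returns (noise_prediction, bottleneck_feature).
    """
    eps_hat = [math.tanh(x) for x in x_t]           # toy noise prediction
    h = [sum(x_t) / len(x_t), max(x_t) - min(x_t)]  # toy "h-space" feature
    return eps_hat, h


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def diffusion_guided_loss(restored, clean, t, rng, sem_weight=0.1):
    """Noise both images with the same sample, then compare denoiser outputs and features."""
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    eps_r, h_r = toy_denoiser(q_sample(restored, t, noise), t)
    eps_c, h_c = toy_denoiser(q_sample(clean, t, noise), t)
    naturalness = mse(eps_r, eps_c)  # agreement in the sampling space
    semantic = mse(h_r, h_c)         # agreement in the bottleneck feature space
    return naturalness + sem_weight * semantic


if __name__ == "__main__":
    clean = [0.1, 0.5, 0.9]        # flattened toy "image"
    restored = [0.2, 0.4, 0.95]    # output of some restoration network
    print(diffusion_guided_loss(restored, clean, t=200, rng=random.Random(0)))
```

In a real training setup, a term of this kind would presumably be added to the restoration network's usual reconstruction loss, with gradients flowing only into the restoration network while the diffusion model stays frozen; that arrangement is an assumption here and is not stated in the abstract.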