DiffIR: Efficient Diffusion Model for Image Restoration

Diffusion models (DMs) have achieved state-of-the-art (SOTA) performance by modeling image synthesis as the sequential application of a denoising network. Unlike image synthesis, however, image restoration (IR) is strongly constrained: the result must be consistent with the ground truth. For IR, it is therefore inefficient for a traditional DM to run many iterations of a large model to estimate whole images or feature maps. To address this issue, we propose an efficient DM for IR (DiffIR), which consists of a compact IR prior extraction network (CPEN), a dynamic IR transformer (DIRformer), and a denoising network. DiffIR has two training stages: pretraining and DM training. In pretraining, we feed ground-truth images into CPEN$_{S1}$ to capture a compact IR prior representation (IPR) that guides DIRformer. In the second stage, we train the DM to estimate the same IPR as the pretrained CPEN$_{S1}$ directly, using only low-quality (LQ) images. We observe that, because the IPR is only a compact vector, DiffIR can use fewer iterations than a traditional DM to obtain accurate estimates and generate more stable and realistic results. Because the iterations are few, DiffIR can jointly optimize CPEN$_{S2}$, DIRformer, and the denoising network, which further reduces the influence of estimation errors. We conduct extensive experiments on several IR tasks and achieve SOTA performance at a lower computational cost. Code is available at \url{https://github.com/Zj-BinXia/DiffIR}.
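
The idea of running a short diffusion chain over a compact vector, rather than over a whole image, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract or the released code: the function names, the 4-step linear noise schedule, the deterministic DDPM-style mean update, and the oracle noise predictor, which stands in for the trained denoising network (in practice that network would be conditioned on features from the LQ image).

```python
import math
import random


def make_schedule(num_steps, beta_start=0.1, beta_end=0.99):
    """Linear beta schedule (illustrative values). Entry t-1 holds step t."""
    span = max(num_steps - 1, 1)
    betas = [beta_start + (beta_end - beta_start) * i / span
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)  # cumulative product of alphas
    return alphas, alpha_bars


def forward_diffuse(z0, alpha_bar_T, rng):
    """Sample z_T ~ N(sqrt(alpha_bar_T) * z0, (1 - alpha_bar_T) * I)."""
    return [math.sqrt(alpha_bar_T) * z
            + math.sqrt(1.0 - alpha_bar_T) * rng.gauss(0.0, 1.0)
            for z in z0]


def reverse_sample(z_T, predict_noise, alphas, alpha_bars):
    """Deterministic reverse chain: apply the DDPM posterior-mean update
    at every step, with no fresh noise injected (an assumption here)."""
    z = list(z_T)
    for t in range(len(alphas), 0, -1):
        a, ab = alphas[t - 1], alpha_bars[t - 1]
        eps = predict_noise(z, t)
        coef = (1.0 - a) / math.sqrt(1.0 - ab)
        z = [(zi - coef * ei) / math.sqrt(a) for zi, ei in zip(z, eps)]
    return z


def make_oracle(z0, alpha_bars):
    """Hypothetical stand-in for the denoising network: returns the exact
    noise implied by z_t and the true vector z0."""
    def predict_noise(z_t, t):
        ab = alpha_bars[t - 1]
        return [(zi - math.sqrt(ab) * z0i) / math.sqrt(1.0 - ab)
                for zi, z0i in zip(z_t, z0)]
    return predict_noise


# A toy 4-dimensional "prior vector" and a 4-step chain.
z0 = [0.5, -1.0, 2.0, 0.0]
alphas, alpha_bars = make_schedule(4)
z_T = forward_diffuse(z0, alpha_bars[-1], random.Random(0))
z_hat = reverse_sample(z_T, make_oracle(z0, alpha_bars), alphas, alpha_bars)
```

With a perfect noise predictor, the final update inverts the forward step exactly, so `z_hat` matches `z0` up to floating-point error. A learned predictor would only approximate this, which is why the abstract's joint optimization of the estimator and the restoration network is relevant. Each step here touches only a handful of numbers, so a chain this short is cheap compared with denoising a full image or feature map.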
