This paper introduces ConStyle v2, a strong plug-and-play prompter designed to output clean visual prompts and assist U-Net Image Restoration models in handling multiple degradations. The joint training process of IRConStyle, an Image Restoration framework consisting of ConStyle and a general restoration network, is divided into two stages: ConStyle is first pre-trained alone, and its weights are then frozen to guide the training of the general restoration network. Three improvements are proposed for the pre-training stage: unsupervised pre-training, the addition of a pretext task (i.e., classification), and the adoption of knowledge distillation. Without bells and whistles, ConStyle v2, a strong prompter for all-in-one Image Restoration, can be obtained in less than two GPU-days and requires no fine-tuning. Extensive experiments on Restormer (Transformer-based), NAFNet (CNN-based), MAXIM-1S (MLP-based), and a vanilla CNN demonstrate that ConStyle v2 can enhance any U-Net-style Image Restoration model into an all-in-one Image Restoration model. Furthermore, models guided by the well-trained ConStyle v2 outperform those guided by the original ConStyle on certain specific degradations.
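The two-stage process described above can be sketched as follows. This is a minimal illustrative skeleton, not the paper's actual architecture or API: the module names, dimensions, and the prompt-injection mechanism are all assumptions, and only the pretext classification loss of stage one is shown (the unsupervised and distillation losses are omitted for brevity).

```python
import torch
import torch.nn as nn

class ConStyle(nn.Module):
    """Toy stand-in for the ConStyle prompter (hypothetical, much smaller)."""
    def __init__(self, dim=16, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, dim, 3, padding=1),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(dim, num_classes)  # pretext classification head

    def forward(self, x):
        z = self.encoder(x)              # the "clean visual prompt"
        return z, self.cls_head(z)

class Restorer(nn.Module):
    """Toy U-Net-style restorer conditioned on the prompt (hypothetical)."""
    def __init__(self, dim=16):
        super().__init__()
        self.film = nn.Linear(dim, 3)    # inject prompt as a per-channel bias
        self.body = nn.Conv2d(3, 3, 3, padding=1)

    def forward(self, x, prompt):
        return self.body(x) + self.film(prompt)[:, :, None, None]

# Stage 1: pre-train ConStyle alone (here: pretext classification only).
prompter, ce = ConStyle(), nn.CrossEntropyLoss()
opt = torch.optim.Adam(prompter.parameters(), lr=1e-3)
x = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 10, (4,))
_, logits = prompter(x)
ce(logits, labels).backward()
opt.step()

# Stage 2: freeze ConStyle's weights; train only the guided restorer.
for p in prompter.parameters():
    p.requires_grad_(False)
restorer = Restorer()
opt2 = torch.optim.Adam(restorer.parameters(), lr=1e-3)
degraded, clean = torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32)
prompt, _ = prompter(degraded)           # frozen prompter guides the restorer
loss = nn.functional.l1_loss(restorer(degraded, prompt), clean)
loss.backward()
opt2.step()
```

The key design point this sketch mirrors is that stage two never updates the prompter, so one pre-trained ConStyle v2 can be reused across different restoration backbones without fine-tuning.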