Super-resolution (SR) is an ill-posed inverse problem: the set of feasible solutions consistent with a given low-resolution image is very large. Many algorithms have been proposed to find a "good" solution among the feasible ones, that is, one that strikes a balance between fidelity and perceptual quality. Unfortunately, all known methods generate artifacts and hallucinations while trying to reconstruct high-frequency (HF) image details. A fundamental question is: can a model learn to distinguish genuine image details from artifacts? Although some recent works have focused on differentiating details from artifacts, this remains a very challenging problem, and a satisfactory solution is yet to be found. This paper shows that the characterization of genuine HF details versus artifacts can be learned better by training GAN-based SR models with wavelet-domain loss functions than with RGB-domain or Fourier-space losses. Although wavelet-domain losses have been used in the literature before, they have not been used in the context of the SR task. More specifically, we train the discriminator only on the HF wavelet sub-bands instead of on RGB images, and we train the generator with a fidelity loss over wavelet sub-bands to make it sensitive to the scale and orientation of structures. Extensive experimental results demonstrate that our model achieves a better perception-distortion trade-off according to multiple objective measures and visual evaluations.
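To make the two wavelet-domain ingredients concrete, the following is a minimal NumPy sketch, not the implementation used in the paper. It assumes a single-level orthonormal Haar transform on a single-channel image with even height and width; the abstract does not specify the wavelet family, the number of decomposition levels, or the loss weights. The function names (`haar_dwt2`, `hf_subbands`, `wavelet_fidelity_loss`) and the default weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2D orthonormal Haar transform of an (H, W) array.

    Assumes H and W are even. Returns four (H/2, W/2) sub-bands:
    one low-frequency approximation and three HF detail bands.
    Sub-band naming conventions (LH vs. HL) vary between libraries.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average (low frequency)
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def hf_subbands(img):
    """Stack the three HF sub-bands into a (3, H/2, W/2) array.

    In this sketch, this stack (rather than the image itself) is what
    a discriminator would receive as input.
    """
    _, lh, hl, hh = haar_dwt2(img)
    return np.stack([lh, hl, hh], axis=0)


def wavelet_fidelity_loss(sr, hr, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of mean absolute errors over the four sub-bands.

    The weights are placeholders (an assumption), not values taken
    from the paper.
    """
    loss = 0.0
    for w, s, h in zip(weights, haar_dwt2(sr), haar_dwt2(hr)):
        loss += w * np.mean(np.abs(s - h))
    return float(loss)
```

Because each detail band responds to a different orientation, penalizing errors per sub-band weights horizontal, vertical, and diagonal structures separately, which an RGB-domain loss cannot do directly. A multi-level or stationary transform would add sensitivity to scale; this sketch uses one level only to stay short.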