In the field of single image super-resolution (SISR), transformer-based models have demonstrated significant advancements. However, the potential and efficiency of these models in applied settings such as real-world image super-resolution have received less attention, leaving substantial room for improvement. Recently, the composite fusion attention transformer (CFAT) outperformed previous state-of-the-art (SOTA) models in classic image super-resolution. In this paper, we propose IG-CFAT, a novel GAN-based framework that incorporates the CFAT model to effectively exploit the performance of transformers in real-world image super-resolution. Our approach integrates a semantic-aware discriminator to reconstruct fine details more accurately and employs an adaptive degradation model to better simulate real-world degradations. Moreover, we introduce a new combination of loss functions, adding a wavelet loss to the standard losses of GAN-based models to better recover high-frequency details. Empirical results demonstrate that IG-CFAT significantly outperforms existing SOTA models on both quantitative and qualitative metrics, recovering fine details and generating realistic textures substantially better. IG-CFAT thus offers a robust and adaptable solution for real-world image super-resolution tasks.