Image super-resolution aims to generate a high-resolution image from its low-resolution counterpart. However, increasingly complex neural networks bring high computational and memory costs. The task nevertheless remains an active research area, as it promises to overcome resolution limitations in many applications. In recent years, transformers have made significant progress in computer vision tasks owing to their robust self-attention mechanism. Yet recent transformer-based works on image super-resolution still contain convolution operations. To address this problem, we propose a patch translator for image super-resolution (PTSR). The proposed PTSR is a transformer-based GAN with no convolution operations. We introduce a novel patch translator module that regenerates improved patches utilising multi-head attention; the generator then uses these patches to produce the $2\times$ and $4\times$ super-resolution images. Experiments are performed on benchmark datasets, including DIV2K, Set5, Set14, and BSD100. For $4\times$ super-resolution, the proposed model improves on the best competitive models by an average of 21.66% in PSNR score and 11.59% in SSIM score. We also analyse the proposed loss and saliency map to show the effectiveness of the proposed method.
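For readers unfamiliar with the PSNR metric quoted in the abstract, the following is a minimal sketch of how PSNR and a relative percentage gain could be computed. It is not the authors' evaluation code. The function names `mse`, `psnr`, and `relative_gain_percent` are hypothetical, pixel values are assumed to lie in the 8-bit range, and the abstract does not state how the reported average improvement was derived, so the relative-gain formula here is an assumption.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB (assumes 8-bit pixels by default)."""
    err = mse(reference, estimate)
    if err == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def relative_gain_percent(baseline, proposed):
    """Relative improvement of `proposed` over `baseline`, in percent.

    Assumption: the abstract's percentages are relative gains of this form.
    """
    return 100.0 * (proposed - baseline) / baseline
```

A higher PSNR means the super-resolved image is closer to the ground-truth high-resolution image; a relative gain is then taken against the PSNR of a competing model on the same dataset.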