Recent advancements in single image super-resolution have been predominantly driven by token mixers and transformer architectures. WaveMixSR utilized the WaveMix architecture, employing a two-dimensional discrete wavelet transform for spatial token mixing, achieving superior performance in super-resolution tasks with remarkable resource efficiency. In this work, we present an enhanced version of the WaveMixSR architecture by (1) replacing the traditional transpose convolution layer with a pixel shuffle operation and (2) implementing a multistage design for higher-resolution tasks ($4\times$). Our experiments demonstrate that our enhanced model -- WaveMixSR-V2 -- outperforms other architectures in multiple super-resolution tasks, achieving state-of-the-art results on the BSD100 dataset, while consuming fewer resources and exhibiting higher parameter efficiency, lower latency, and higher throughput. Our code is available at https://github.com/pranavphoenix/WaveMixSR.
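To illustrate the first change, the following is a minimal NumPy sketch of the pixel shuffle operation that replaces transpose convolution for upsampling: it rearranges a tensor of shape $(C r^2, H, W)$ into $(C, Hr, Wr)$ by moving channel groups into spatial positions, avoiding the checkerboard artifacts and learned parameters of transpose convolution. This is an illustrative implementation, not the authors' code; in practice, frameworks such as PyTorch provide this operation natively (e.g., `nn.PixelShuffle`).

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Each block of r*r channels is scattered into an r x r spatial
    neighborhood, so out[c, h*r + i, w*r + j] = x[c*r*r + i*r + j, h, w].
    """
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)          # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)        # interleave: (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)     # merge into upscaled spatial dims

# Example: 4 channels collapse into a single 2x-upscaled channel.
x = np.arange(16).reshape(4, 2, 2)
y = pixel_shuffle(x, 2)
print(y.shape)  # (1, 4, 4)
```

Because the rearrangement is parameter-free, the upsampling cost is a pure memory permutation, which contributes to the lower latency reported for WaveMixSR-V2.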