Self-similarity techniques are booming in blind super-resolution (SR) because they accurately estimate the degradation types present in low-resolution images. However, the high-dimensional matrix multiplication involved in computing self-similarity incurs prohibitive computational cost. We observe that the high-dimensional attention map is derived from the matrix multiplication between Query and Key, followed by a softmax function. This softmax makes the Query-Key matrix multiplication inseparable, posing a great challenge to simplifying the computational complexity. To address this issue, we first propose a second-order Taylor expansion approximation (STEA) that separates the matrix multiplication of Query and Key, reducing the complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$. We then design a multi-scale large field reception (MLFR) module to compensate for the performance degradation caused by STEA. Finally, we apply these two core designs to laboratory and real-world scenarios by constructing LabNet and RealNet, respectively. Extensive experiments on five synthetic datasets demonstrate that our LabNet sets a new benchmark in both qualitative and quantitative evaluations. On the RealWorld38 dataset, our RealNet achieves superior visual quality over existing methods. Ablation studies further verify the contributions of STEA and MLFR within both the LabNet and RealNet frameworks.
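The separation idea can be illustrated with a minimal NumPy sketch. This is not the paper's STEA implementation; it only shows the general mechanism such an approximation relies on: replacing $\exp(q \cdot k)$ with its second-order Taylor expansion $1 + q \cdot k + (q \cdot k)^2/2$, which equals an inner product of explicit feature maps $\phi(q) \cdot \phi(k)$. Attention then becomes $\phi(Q)\,(\phi(K)^\top V)$, so $\phi(K)^\top V$ can be computed once and the per-row cost no longer scales with $N^2$. All function names here are illustrative.

```python
import numpy as np

def phi(X):
    """Second-order Taylor feature map: phi(q) . phi(k) = 1 + q.k + (q.k)^2 / 2."""
    N, d = X.shape
    ones = np.ones((N, 1))
    # Outer products flattened; the 1/sqrt(2) factors give the (q.k)^2 / 2 term.
    quad = np.einsum('ni,nj->nij', X, X).reshape(N, d * d) / np.sqrt(2.0)
    return np.concatenate([ones, X, quad], axis=1)  # shape (N, 1 + d + d^2)

def taylor_attention(Q, K, V):
    """Separable attention: phi(Q) @ (phi(K)^T V), linear in sequence length N."""
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                     # (1 + d + d^2, d_v), computed once
    Z = Qf @ Kf.sum(axis=0)           # row-wise normalizer (always positive,
                                      # since 1 + x + x^2/2 > 0 for all real x)
    return (Qf @ KV) / Z[:, None]

def softmax_attention(Q, K, V):
    """Reference quadratic-cost softmax attention for comparison."""
    S = Q @ K.T
    A = np.exp(S - S.max(axis=1, keepdims=True))  # shift-invariant stabilization
    A /= A.sum(axis=1, keepdims=True)
    return A @ V
```

For small-magnitude Query-Key products the Taylor weights closely track the softmax weights, while the factored form costs $\mathcal{O}(N d^2 d_v)$ instead of $\mathcal{O}(N^2)$. The quadratic feature dimension $d^2$ is the price of the second-order term, which is why such schemes are typically applied per channel group or with reduced head dimensions.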