HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution

Transformer-based methods have demonstrated excellent performance on super-resolution tasks, surpassing conventional convolutional neural networks. However, existing work typically restricts self-attention computation to non-overlapping windows to reduce computational cost, which means that Transformer-based networks can only use input information from a limited spatial range. This paper therefore proposes a novel Hybrid Multi-Axis Aggregation network (HMA) to better exploit the potential information in features. HMA is constructed by stacking Residual Hybrid Transformer Blocks (RHTB) and Grid Attention Blocks (GAB). On the one hand, RHTB combines channel attention and self-attention to enhance non-local feature fusion and produce more visually pleasing results. On the other hand, GAB is used for cross-domain information interaction to jointly model similar features and obtain a larger receptive field. A novel pre-training method for the super-resolution task is also designed to further enhance the model's representation capability, and the effectiveness of the proposed model is validated through extensive experiments. The experimental results show that HMA outperforms state-of-the-art methods on the benchmark dataset. Code and models are available at https://github.com/korouuuuu/HMA.
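The difference between attention restricted to non-overlapping windows and attention over a grid can be illustrated with a small NumPy sketch. The abstract does not say how GAB groups pixels, so the strided grid partition below (in the style of MaxViT) is an assumption, not the authors' implementation. The function names `window_partition` and `grid_partition` are hypothetical.

```python
import numpy as np

def window_partition(x, w):
    """Split an (H, W, C) map into non-overlapping w x w windows.

    Returns (num_windows, w*w, C). Each window holds spatially adjacent
    pixels, so attention inside it sees only a local neighbourhood.
    Assumes H and W are divisible by w.
    """
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    x = x.transpose(0, 2, 1, 3, 4)  # (win_row, win_col, row_in_win, col_in_win, C)
    return x.reshape(-1, w * w, C)

def grid_partition(x, g):
    """Split an (H, W, C) map into groups of g x g pixels on a strided grid.

    Returns (num_groups, g*g, C). Pixels in a group are H//g rows and
    W//g columns apart, so each group spans the whole map.
    Assumes H and W are divisible by g. (Assumed MaxViT-style grouping.)
    """
    H, W, C = x.shape
    x = x.reshape(g, H // g, g, W // g, C)
    x = x.transpose(1, 3, 0, 2, 4)  # (row_offset, col_offset, grid_row, grid_col, C)
    return x.reshape(-1, g * g, C)

# Feature map whose "channels" are the pixel's own (row, col) coordinates,
# so the output shows directly which pixels end up grouped together.
coords = np.stack(
    np.meshgrid(np.arange(8), np.arange(8), indexing="ij"), axis=-1
)

win0 = window_partition(coords, 2)[0]   # first local window
grid0 = grid_partition(coords, 2)[0]    # first strided grid group

print("window group:", win0.tolist())   # adjacent pixels
print("grid group:  ", grid0.tolist())  # pixels spread across the map
```

With an 8x8 map and groups of 4 pixels, the first window covers only the top-left 2x2 corner, while the first grid group samples pixels four rows and four columns apart. Attention computed within grid groups therefore mixes information from distant regions at the same per-group cost.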

