Hybrid Transformer and CNN Attention Network for Stereo Image Super-resolution

Multi-stage strategies are frequently employed in image restoration tasks. While transformer-based methods have exhibited high efficiency in single-image super-resolution tasks, they have not yet shown significant advantages over CNN-based methods in stereo super-resolution tasks. This can be attributed to two key factors: first, current single-image super-resolution transformers cannot leverage the complementary information between the two stereo views; second, transformer performance typically relies on sufficient training data, which is lacking in common stereo-image super-resolution settings. To address these issues, we propose a Hybrid Transformer and CNN Attention Network (HTCAN), which uses a transformer-based network for single-image enhancement and a CNN-based network for stereo information fusion. Furthermore, we employ a multi-patch training strategy and larger window sizes to activate more input pixels for super-resolution. We also revisit other advanced techniques, such as data augmentation, data ensemble, and model ensemble, to reduce overfitting and data bias. Finally, our approach achieved a score of 23.90 dB and won Track 1 of the NTIRE 2023 Stereo Image Super-Resolution Challenge.
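
The abstract does not spell out how the "data ensemble" is computed. One common reading is a test-time flip ensemble, sketched below under that assumption; it is not the authors' implementation. The names `self_ensemble` and `toy_model` are hypothetical, images are plain nested lists of grayscale values, and the model is any callable that maps a (left, right) pair to a (left, right) pair. The stereo-specific detail is that a horizontal flip mirrors the disparity direction, so the two views are swapped before the model call and swapped back afterwards.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values
StereoModel = Callable[[Image, Image], Tuple[Image, Image]]


def vflip(img: Image) -> Image:
    """Flip an image top-to-bottom."""
    return img[::-1]


def hflip(img: Image) -> Image:
    """Flip an image left-to-right."""
    return [row[::-1] for row in img]


def average(images: List[Image]) -> Image:
    """Pixel-wise mean of equally sized images."""
    n = len(images)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*images)]


def self_ensemble(model: StereoModel, left: Image, right: Image) -> Tuple[Image, Image]:
    """Average model outputs over four flip variants of a stereo pair (assumed scheme)."""
    outs_l, outs_r = [], []
    for use_v in (False, True):
        for use_h in (False, True):
            l_in, r_in = left, right
            if use_v:
                # A vertical flip keeps the epipolar geometry; the views stay in place.
                l_in, r_in = vflip(l_in), vflip(r_in)
            if use_h:
                # A horizontal flip mirrors disparity, so the views swap roles.
                l_in, r_in = hflip(r_in), hflip(l_in)
            l_out, r_out = model(l_in, r_in)
            if use_h:
                # Undo the horizontal flip and the view swap.
                l_out, r_out = hflip(r_out), hflip(l_out)
            if use_v:
                l_out, r_out = vflip(l_out), vflip(r_out)
            outs_l.append(l_out)
            outs_r.append(r_out)
    return average(outs_l), average(outs_r)


def toy_model(left: Image, right: Image) -> Tuple[Image, Image]:
    """Hypothetical stand-in for a stereo SR network: 2x nearest-neighbour upscaling."""
    def up(img: Image) -> Image:
        return [[px for px in row for _ in range(2)] for row in img for _ in range(2)]
    return up(left), up(right)


if __name__ == "__main__":
    left = [[1.0, 2.0], [3.0, 4.0]]
    right = [[5.0, 6.0], [7.0, 8.0]]
    sr_left, sr_right = self_ensemble(toy_model, left, right)
    print(len(sr_left), len(sr_left[0]))  # output size is twice the input size
```

Because the toy model treats flipped inputs the same way as unflipped ones, the ensemble simply reproduces its single-pass output; with a trained network the four predictions differ slightly, and averaging them is what reduces variance. A model ensemble can be sketched the same way by passing the outputs of several networks to `average`.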

