NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images, as typically captured by a dual-camera device. To improve the quality of SR images, most previous studies have focused on increasing the number and size of feature maps and on introducing complex, computationally intensive structures, resulting in models with high computational complexity. Here, we propose a simple yet efficient stereo image SR model called NAFRSSR, which modifies the previous state-of-the-art model NAFSSR by introducing recursive connections and making the constituent modules lightweight. NAFRSSR is composed of nonlinear-activation-free, group-convolution-based blocks (NAFGCBlocks) and depth-separated stereo cross attention modules (DSSCAMs). The NAFGCBlock improves feature extraction and reduces the number of parameters by removing the simple channel attention mechanism from the NAFBlock and using group convolution. The DSSCAM enhances feature fusion and reduces the number of parameters by replacing the 1x1 pointwise convolution in SCAM with a weight-shared 3x3 depthwise convolution. In addition, we propose incorporating a trainable edge detection operator into NAFRSSR to further improve model performance. We design four variants of NAFRSSR with different sizes, namely NAFRSSR-Mobile (NAFRSSR-M), NAFRSSR-Tiny (NAFRSSR-T), NAFRSSR-Super (NAFRSSR-S), and NAFRSSR-Base (NAFRSSR-B); all of them have fewer parameters, higher PSNR/SSIM, and faster inference than the previous state-of-the-art models. In particular, to the best of our knowledge, NAFRSSR-M is the lightest (0.28M parameters) and fastest (50 ms inference time) model achieving an average PSNR/SSIM as high as 24.657 dB/0.7622 on the benchmark datasets. Code and models will be released at https://github.com/JNUChenYiHong/NAFRSSR.
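
The parameter savings attributed to group convolution and to depthwise convolution can be checked with simple arithmetic. The following is a minimal sketch, not the authors' implementation: the channel width of 64 and the group count of 4 are hypothetical values chosen for illustration (the abstract does not state them), and `conv_params` is an illustrative helper rather than part of the NAFRSSR code.

```python
def conv_params(c_in, c_out, k, groups=1, bias=True):
    """Count learnable parameters of a 2D convolution with a k x k kernel.

    With `groups` > 1, each output channel only sees c_in / groups input
    channels, which is what shrinks the weight tensor.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = c_out * (c_in // groups) * k * k
    return weights + (c_out if bias else 0)


C = 64  # hypothetical channel width, not taken from the paper

standard_3x3 = conv_params(C, C, 3)               # dense 3x3 convolution
grouped_3x3 = conv_params(C, C, 3, groups=4)      # group convolution, 4 groups (assumed)
pointwise_1x1 = conv_params(C, C, 1)              # 1x1 pointwise convolution
depthwise_3x3 = conv_params(C, C, 3, groups=C)    # 3x3 depthwise: one filter per channel

print("standard 3x3 :", standard_3x3)
print("grouped 3x3  :", grouped_3x3)
print("pointwise 1x1:", pointwise_1x1)
print("depthwise 3x3:", depthwise_3x3)
```

Under these assumptions, the grouped 3x3 convolution holds roughly a quarter of the weights of the dense one, and a 3x3 depthwise convolution is several times smaller than a 1x1 pointwise convolution at the same width. Sharing one depthwise convolution between the left and right views, as the weight-shared design suggests, would avoid a second copy of those weights.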

