Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images, as typically captured by a dual-camera device. To enhance the quality of SR images, most previous studies have focused on increasing the number and size of feature maps and on introducing complex, computationally intensive structures, resulting in models with high computational complexity. Here, we propose a simple yet efficient stereo image SR model called NAFRSSR, which modifies the previous state-of-the-art model NAFSSR by introducing recursive connections and making the constituent modules more lightweight. NAFRSSR is composed of nonlinear activation free and group convolution-based blocks (NAFGCBlocks) and depth-separated stereo cross attention modules (DSSCAMs). The NAFGCBlock improves feature extraction and reduces the number of parameters by removing the simple channel attention mechanism from the NAFBlock and using group convolution. The DSSCAM enhances feature fusion and reduces the number of parameters by replacing the 1x1 pointwise convolution in SCAM with a weight-shared 3x3 depthwise convolution. In addition, we propose incorporating a trainable edge detection operator into NAFRSSR to further improve model performance. We design four variants of NAFRSSR with different sizes, namely NAFRSSR-Mobile (NAFRSSR-M), NAFRSSR-Tiny (NAFRSSR-T), NAFRSSR-Super (NAFRSSR-S), and NAFRSSR-Base (NAFRSSR-B); all of them have fewer parameters, higher PSNR/SSIM, and faster inference than the previous state-of-the-art models. In particular, to the best of our knowledge, NAFRSSR-M is the lightest (0.28M parameters) and fastest (50 ms inference time) model achieving an average PSNR/SSIM as high as 24.657 dB/0.7622 on the benchmark datasets. Code and models will be released at https://github.com/JNUChenYiHong/NAFRSSR.
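To make the parameter savings from group and depthwise convolution concrete, the following is a minimal sketch that counts the learnable parameters of a single convolution layer using the standard formula. It is not taken from the NAFRSSR code: the helper `conv2d_params`, the channel width of 64, and the group count of 4 are all hypothetical values chosen for illustration.

```python
def conv2d_params(c_in, c_out, kernel, groups=1, bias=True):
    """Count learnable parameters of a 2-D convolution layer.

    Each output channel sees only c_in / groups input channels,
    so the weight tensor has (c_in / groups) * c_out * kernel^2 entries.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    weights = (c_in // groups) * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


C = 64  # hypothetical channel width, not a value from the paper

# Dense 1x1 pointwise convolution (the kind of layer the abstract says is replaced).
pointwise = conv2d_params(C, C, kernel=1)

# 1x1 group convolution with 4 groups (group count is an assumption).
grouped = conv2d_params(C, C, kernel=1, groups=4)

# 3x3 depthwise convolution: one 3x3 filter per channel (groups == channels).
depthwise = conv2d_params(C, C, kernel=3, groups=C)

print(pointwise, grouped, depthwise)
```

Under these assumed sizes, the depthwise layer needs far fewer parameters than the dense pointwise layer even though its kernel is larger. Sharing one such layer between the left and right views, rather than keeping a separate copy for each, would count those parameters only once.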