Super-resolution (SR) has advanced significantly, yet many convolutional neural network (CNN)-based SR models focus primarily on restoring high-frequency details and often overlook crucial low-frequency contour information. Transformer-based SR methods incorporate global structural details but frequently carry a large number of parameters, leading to high computational overhead. In this paper, we address these challenges by introducing the Multi-Depth Branches Network (MDBN). This framework extends the ResNet architecture by integrating an additional branch that captures vital structural characteristics of images. Our proposed multi-depth branches module (MDBM) stacks convolutional kernels of identical size at varying depths within distinct branches. Through a comprehensive analysis of the feature maps, we observe that branches of differing depths extract contour and detail information, respectively. By integrating these branches, the overall architecture preserves essential low-frequency semantic structural information while restoring high-frequency visual elements, which aligns more closely with human visual cognition. Compared to GoogLeNet-like models, our basic multi-depth branches structure has fewer parameters, higher computational efficiency, and improved performance. Our model outperforms state-of-the-art (SOTA) lightweight SR methods with less inference time. Our code is available at https://github.com/thy960112/MDBN
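The parameter argument for stacking same-size kernels at different depths can be illustrated with a small counting sketch. It compares a two-branch block built only from 3x3 convolutions (one branch of depth 1, one of depth 2) against a simplified multi-kernel block with a 3x3 and a 5x5 branch, which covers the same receptive fields. The channel width of 64, the depths (1, 2), the kernel sizes, the absence of 1x1 reduction layers, and all function names are assumptions chosen for illustration; they are not the MDBN configuration from the paper or its repository.

```python
from typing import Sequence


def conv_params(c_in: int, c_out: int, k: int, bias: bool = True) -> int:
    """Learnable parameters of a single k x k 2-D convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def multi_depth_block_params(channels: int, k: int, depths: Sequence[int]) -> int:
    """Block whose branches stack the SAME kernel size at different depths.

    A branch of depth d is d consecutive k x k convolutions of constant width.
    """
    return sum(d * conv_params(channels, channels, k) for d in depths)


def multi_kernel_block_params(channels: int, kernel_sizes: Sequence[int]) -> int:
    """Simplified multi-kernel block: one convolution per branch,
    each branch using a different kernel size (no 1x1 reductions)."""
    return sum(conv_params(channels, channels, k) for k in kernel_sizes)


# Hypothetical settings, chosen only for illustration.
CHANNELS = 64

# Two stacked 3x3 convolutions see a 5x5 window, so both blocks below
# cover 3x3 and 5x5 receptive fields.
multi_depth = multi_depth_block_params(CHANNELS, k=3, depths=(1, 2))
multi_kernel = multi_kernel_block_params(CHANNELS, kernel_sizes=(3, 5))

print(f"multi-depth  (3x3 at depths 1 and 2): {multi_depth:,} parameters")
print(f"multi-kernel (3x3 and 5x5 branches):  {multi_kernel:,} parameters")
```

Under these assumptions the multi-depth block needs 110,784 parameters against 139,392 for the multi-kernel block. The gap grows with the largest receptive field, since a single k x k kernel costs on the order of k² weights per channel pair while stacked 3x3 layers grow linearly in depth.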