Object detection and single-image super-resolution are classic problems in computer vision (CV). Object detection aims to recognize the objects in an input image, while image restoration aims to reconstruct a high-quality image from a given low-quality one. This paper proposes a two-stage framework for object detection and image restoration. The first stage uses a YOLO-series algorithm to perform object detection and then crops the image. In the second stage, we improve the Swin Transformer and use a newly proposed algorithm to connect Swin Transformer layers, yielding a new neural network architecture for image restoration that we name SwinOIR. We compare the performance of different versions of the YOLO detection algorithm on the MS COCO and Pascal VOC datasets, demonstrating the suitability of different YOLO models for the first stage of the framework in different scenarios. For the image super-resolution task, we compare the performance of different methods of connecting Swin Transformer layers and design SwinOIR models of different sizes for use in different real-world scenarios. Our implementation code is released at https://github.com/Rubbbbbbbbby/SwinOIR.
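The data flow of the two-stage framework (detect, crop, then restore each crop) can be illustrated with a minimal sketch. This is not the released implementation: the function names `crop_detections`, `upscale_placeholder`, and `two_stage` are hypothetical, the boxes are assumed to arrive as pixel-coordinate `(x1, y1, x2, y2)` tuples from some detector, and nearest-neighbour upscaling merely stands in for the restoration network.

```python
import numpy as np

def crop_detections(image, boxes):
    """Crop each (x1, y1, x2, y2) box from an H x W x C image, clamping to the image bounds."""
    height, width = image.shape[:2]
    crops = []
    for x1, y1, x2, y2 in boxes:
        # Clamp coordinates so a box that spills over the border still yields a valid crop.
        x1, x2 = max(0, int(x1)), min(width, int(x2))
        y1, y2 = max(0, int(y1)), min(height, int(y2))
        if x2 > x1 and y2 > y1:  # skip boxes that fall entirely outside the image
            crops.append(image[y1:y2, x1:x2].copy())
    return crops

def upscale_placeholder(crop, scale=2):
    """Nearest-neighbour upscaling; an assumed stand-in for the restoration network, NOT SwinOIR."""
    return np.repeat(np.repeat(crop, scale, axis=0), scale, axis=1)

def two_stage(image, boxes, scale=2):
    """Stage 1 output (boxes) -> crops -> stage 2 (restoration of each crop)."""
    return [upscale_placeholder(c, scale) for c in crop_detections(image, boxes)]

# Hypothetical input: a 100 x 120 RGB image and two boxes a detector might return.
image = np.zeros((100, 120, 3), dtype=np.uint8)
boxes = [(10, 20, 50, 60), (100, 80, 140, 120)]  # the second box spills past the border
restored = two_stage(image, boxes, scale=2)
```

Keeping the two stages behind separate functions means that either one can be swapped independently, for example a different YOLO version in the first stage or a differently sized restoration model in the second.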