In this paper, we consider two challenging issues in reference-based super-resolution (RefSR) for smartphones: (i) how to choose a proper reference image, and (ii) how to learn RefSR in a self-supervised manner. Specifically, we propose a novel self-supervised learning approach for real-world RefSR from observations at dual and multiple camera zooms. First, given that modern smartphones commonly carry multiple cameras, the more zoomed (telephoto) image can naturally serve as the reference to guide the super-resolution (SR) of the less zoomed (ultra-wide) image, which makes it possible to learn a deep network that performs SR from dual zoomed observations (DZSR). Second, for self-supervised learning of DZSR, we use the telephoto image, rather than an additional high-resolution image, as the supervision signal, and select a center patch from it as the reference to super-resolve the corresponding ultra-wide image patch. To mitigate the effect of misalignment between the ultra-wide low-resolution (LR) patch and the telephoto ground-truth (GT) image during training, we first adopt patch-based optical flow alignment and then design an auxiliary-LR to guide the deformation of the warped LR features. To generate visually pleasing results, we present a local overlapped sliced Wasserstein loss that better represents the perceptual difference between the GT and the output in feature space. During testing, DZSR can be directly deployed to super-resolve the whole ultra-wide image with the telephoto image as the reference. In addition, we extend self-supervised RefSR to multiple zoomed observations and present a progressive fusion scheme for the effective utilization of the reference images. Experiments show that our methods achieve better quantitative and qualitative performance than state-of-the-art methods. Code is available at https://github.com/cszhilu1998/SelfDZSR_PlusPlus.
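To make the idea of a local overlapped sliced Wasserstein loss concrete, the following is a minimal sketch and not the authors' implementation. It assumes feature maps stored as plain nested lists indexed `[row][col]` with a channel vector at each position, and it computes a sliced Wasserstein-1 distance inside overlapping windows before averaging over windows. The function names, window size (`patch`), `stride`, and number of random projections (`num_proj`) are all hypothetical choices made for illustration; a real training loss would operate on deep features from a pretrained network in a tensor library.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction on the unit sphere in `dim` dimensions."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:
            return [x / norm for x in v]


def sliced_wasserstein(a, b, num_proj=32, seed=0):
    """Approximate sliced Wasserstein-1 distance between two equally sized
    sets of feature vectors (hypothetical helper, illustrative only)."""
    if len(a) != len(b) or not a:
        raise ValueError("feature sets must be non-empty and equally sized")
    rng = random.Random(seed)
    dim = len(a[0])
    total = 0.0
    for _ in range(num_proj):
        d = random_unit_vector(dim, rng)
        # Project every feature vector onto the random direction, then sort:
        # in 1-D, optimal transport simply matches sorted samples.
        pa = sorted(sum(x * w for x, w in zip(v, d)) for v in a)
        pb = sorted(sum(x * w for x, w in zip(v, d)) for v in b)
        total += sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)
    return total / num_proj


def local_overlapped_swd(fa, fb, patch=4, stride=2, num_proj=32, seed=0):
    """Average the sliced Wasserstein distance over overlapping windows.

    `patch` and `stride` are assumed values; stride < patch makes
    neighbouring windows overlap.
    """
    h, w = len(fa), len(fa[0])
    dists = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            rows = range(top, top + patch)
            cols = range(left, left + patch)
            wa = [fa[r][c] for r in rows for c in cols]
            wb = [fb[r][c] for r in rows for c in cols]
            dists.append(sliced_wasserstein(wa, wb, num_proj, seed))
    if not dists:
        raise ValueError("feature map is smaller than the window size")
    return sum(dists) / len(dists)
```

Because each window compares sorted projections rather than position-matched values, the distance is insensitive to where a feature sits inside the window. This may be part of why such a loss tolerates small residual misalignment between the output and the GT, while restricting the comparison to local windows keeps it spatially meaningful.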