Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing

Due to the limitations of current optical and sensor technologies and the high cost of updating them, the spectral and spatial resolution of satellites may not always meet desired requirements. For these reasons, Remote-Sensing Single-Image Super-Resolution (RS-SISR) techniques have gained significant interest. In this paper, we propose the Swin2-MoSE model, an enhanced version of Swin2SR. Our model introduces MoE-SM, an enhanced Mixture-of-Experts (MoE) that replaces the feed-forward network inside every Transformer block. MoE-SM is designed with Smart-Merger, a new layer for merging the outputs of individual experts, and with a new way of splitting the work between experts, defining a per-example strategy instead of the commonly used per-token one. Furthermore, we analyze how positional encodings interact with each other, demonstrating that per-channel bias and per-head bias can positively cooperate. Finally, we propose to use a combination of Normalized-Cross-Correlation (NCC) and Structural Similarity Index Measure (SSIM) losses to avoid typical MSE loss limitations. Experimental results demonstrate that Swin2-MoSE outperforms the state of the art (SOTA) by up to 0.377 ~ 0.958 dB (PSNR) on the tasks of 2x, 3x and 4x resolution upscaling (Sen2Venus and OLI2MSI datasets). We show the efficacy of Swin2-MoSE by applying it to a semantic segmentation task (SeasoNet dataset). Code and pretrained models are available at https://github.com/IMPLabUniPr/swin2-mose/tree/official_code
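
To make two of the quantities named in the abstract concrete, the following is a minimal NumPy sketch of a zero-mean normalized cross-correlation, a loss derived from it, and PSNR. It assumes the common textbook definitions; the function names `ncc`, `ncc_loss` and `psnr`, the `eps` stabilizer, and the `1 - NCC` form of the loss are illustrative assumptions and are not taken from the authors' code, which may differ (see the repository linked above).

```python
import numpy as np


def ncc(x, y, eps=1e-8):
    """Zero-mean normalized cross-correlation of two same-shaped arrays.

    Returns a value in roughly [-1, 1]; eps (an assumption of this sketch)
    guards against division by zero for constant inputs.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum()) + eps
    return float((xc * yc).sum() / denom)


def ncc_loss(pred, target):
    """Hypothetical loss form: 0 when the images are perfectly correlated."""
    return 1.0 - ncc(pred, target)


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Under these definitions NCC is unchanged by a positive affine rescaling of intensities (for example `2 * x + 3`), whereas MSE is not; this is one way a correlation-based term can behave differently from a plain MSE loss.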
