Deep neural networks have succeeded at pan-sharpening, but they typically operate as black boxes that lack transparency and interpretability. To alleviate this issue, we propose a novel model-driven deep unfolding framework with an image reasoning prior tailored to the pan-sharpening task. Unlike existing unfolding solutions, which use proximal operator networks as uncertain and vague priors, our framework is motivated by the content reasoning ability of masked autoencoders (MAE). Specifically, a pre-trained MAE with a spatial masking strategy, acting as an intrinsic reasoning prior, is embedded into the unfolding architecture. Meanwhile, a pre-trained MAE with a spatial-spectral masking strategy serves as a regularization term in the loss function to enforce spatial-spectral consistency. These designs inject the image reasoning prior into deep unfolding networks while improving their interpretability and representation capability. The distinguishing feature of our framework is that the holistic learning process is explicitly integrated with the inherent physical mechanism underlying the pan-sharpening task. Extensive experiments on multiple satellite datasets demonstrate the superiority of our method over existing state-of-the-art approaches. Code will be released at \url{https://manman1995.github.io/}.
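The two masking strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that "spatial masking" hides the same random patches in every band, and that "spatial-spectral masking" draws an independent patch mask for each band. The function names, patch size, and masking ratio are all hypothetical and chosen only for illustration.

```python
import numpy as np


def spatial_mask(height, width, patch, ratio, rng):
    """Return a (height, width) boolean mask; True marks a hidden pixel.

    Assumption: whole patches are hidden at random, and the same mask is
    shared by every spectral band.
    """
    grid_h, grid_w = height // patch, width // patch
    n_patches = grid_h * grid_w
    n_hidden = int(round(ratio * n_patches))
    flat = np.zeros(n_patches, dtype=bool)
    flat[rng.choice(n_patches, size=n_hidden, replace=False)] = True
    grid = flat.reshape(grid_h, grid_w)
    # Expand every grid cell into a patch x patch block of pixels.
    return grid.repeat(patch, axis=0).repeat(patch, axis=1)


def spatial_spectral_mask(bands, height, width, patch, ratio, rng):
    """Return a (bands, height, width) mask with an independent mask per band.

    Assumption: each band hides its own random set of patches.
    """
    return np.stack(
        [spatial_mask(height, width, patch, ratio, rng) for _ in range(bands)]
    )


def apply_mask(image, mask):
    """Zero out hidden pixels of a (bands, height, width) image.

    A 2-D mask is broadcast over the band axis; a 3-D mask is applied as is.
    """
    return np.where(mask, 0.0, image)


rng = np.random.default_rng(0)
image = np.ones((4, 8, 8))  # toy 4-band multispectral tile
masked_spatial = apply_mask(image, spatial_mask(8, 8, 2, 0.75, rng))
masked_spectral = apply_mask(image, spatial_spectral_mask(4, 8, 8, 2, 0.75, rng))
```

Under these assumptions, the spatial variant forces a reconstruction network to infer missing regions from spatial context alone, whereas the spatial-spectral variant also lets it draw on the same location in other bands, which matches the abstract's use of the latter for a spatial-spectral consistency term.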