Deep learning-based techniques for the analysis of multimodal remote sensing data have become popular because they can effectively integrate complementary spatial, spectral, and structural information from different sensors. Recently, denoising diffusion probabilistic models (DDPMs) have attracted attention in the remote sensing community for their ability to capture robust and complex spatial-spectral distributions. However, pre-training multimodal DDPMs may result in modality imbalance, and how to leverage diffusion features to guide the extraction of complementary, diverse features remains an open question. To address these issues, this paper proposes a balanced diffusion-guided fusion (BDGF) framework that leverages multimodal diffusion features to guide a multi-branch network for land-cover classification. Specifically, we propose an adaptive modality masking strategy that encourages the DDPMs to learn a modality-balanced data distribution rather than one dominated by the spectral image. Subsequently, these diffusion features hierarchically guide feature extraction in the CNN, Mamba, and transformer branches through feature fusion, group channel attention, and cross-attention mechanisms. Finally, a mutual learning strategy is developed to enhance inter-branch collaboration by aligning the probability entropy and feature similarity of the individual subnetworks. Extensive experiments on four multimodal remote sensing datasets demonstrate that the proposed method achieves superior classification performance. The code is available at https://github.com/HaoLiu-XDU/BDGF.
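To make the mutual learning idea more concrete, the following is a minimal sketch of one way several branches could be encouraged to agree with each other. It is not taken from the BDGF code: the function names, the use of a symmetric KL divergence between branch predictions as a stand-in for the probability alignment, the cosine distance between branch features, and the `feature_weight` parameter are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def mutual_learning_loss(branch_logits, branch_features, feature_weight=1.0):
    """Hypothetical pairwise agreement loss over all branches.

    Assumption: predictions are aligned with a symmetric KL term and
    features with a cosine distance term, averaged over branch pairs.
    """
    if len(branch_logits) < 2 or len(branch_logits) != len(branch_features):
        raise ValueError("need at least two branches with matching features")
    probs = [softmax(logits) for logits in branch_logits]
    prediction_term = 0.0
    feature_term = 0.0
    pairs = 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            # Symmetric KL: penalise disagreement in both directions.
            prediction_term += 0.5 * (
                kl_divergence(probs[i], probs[j])
                + kl_divergence(probs[j], probs[i])
            )
            # Cosine distance: zero when the two features point the same way.
            feature_term += 1.0 - cosine_similarity(
                branch_features[i], branch_features[j]
            )
            pairs += 1
    return (prediction_term + feature_weight * feature_term) / pairs
```

In this sketch the three inputs would correspond to the CNN, Mamba, and transformer branches for a single sample. The loss is close to zero when the branches produce identical predictions and features, and grows as they diverge; in practice it would be added to each branch's classification loss.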