Semantic change detection (SCD) aims to simultaneously locate land-cover changes and identify the semantic categories before and after each transition. However, existing methods suffer from insufficient cross-temporal alignment, weak multi-scale representation, and poor robustness to pseudo-changes caused by illumination differences, seasonal variation, and registration noise. To address these issues, we propose SemDINO, an end-to-end SCD network that integrates a dual-branch encoder, multi-scale temporal interaction, semantic purification, change enhancement, and decoupled multi-task prediction in a unified framework. Specifically, the dual-branch encoder combines a CNN backbone with frozen DINOv3 features through gated pyramid fusion, yielding rich multi-scale semantic representations. A multi-scale temporal bidirectional transformer interaction (M-TBTT) module then performs global cross-temporal feature alignment and information exchange. To further enhance genuine changes and suppress pseudo-changes, we introduce three cooperating modules: semantic purification (SCP), bidirectional change enhancement (BiChangeEnhance), and multi-scale change enhancement (MCE). Finally, a multi-branch change detection (CD) prediction head jointly outputs a binary change mask, bi-temporal semantic maps, and an edge constraint. Extensive experiments on public remote sensing CD datasets demonstrate that SemDINO achieves better performance and generalization than state-of-the-art methods, especially in complex scenes with interference factors.
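As an illustration of the gated fusion idea mentioned above, the following is a minimal NumPy sketch of how one pyramid level might blend a CNN feature map with a frozen-backbone feature map. The abstract does not specify the gate, so everything here is an assumption: the sigmoid gate computed by a 1x1 convolution over the concatenated channels, the convex-combination form, the premise that both feature maps have already been projected and resized to the same shape, and the names `gated_fusion`, `w_gate`, and `b_gate`, which are hypothetical and not taken from SemDINO.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(cnn_feat, vit_feat, w_gate, b_gate):
    """Fuse two (C, H, W) feature maps with a per-pixel, per-channel gate.

    Assumed form (not from the paper):
        gate  = sigmoid(conv1x1([cnn_feat; vit_feat]))
        fused = gate * cnn_feat + (1 - gate) * vit_feat

    w_gate has shape (C, 2C) and b_gate has shape (C,); together they
    play the role of a 1x1 convolution over the concatenated channels.
    """
    stacked = np.concatenate([cnn_feat, vit_feat], axis=0)  # (2C, H, W)
    # A 1x1 convolution is a linear map over channels at every pixel.
    logits = np.einsum("oc,chw->ohw", w_gate, stacked) + b_gate[:, None, None]
    gate = sigmoid(logits)  # values in (0, 1)
    return gate * cnn_feat + (1.0 - gate) * vit_feat


# Toy inputs standing in for one pyramid level of each branch.
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
cnn_feat = rng.standard_normal((C, H, W))
vit_feat = rng.standard_normal((C, H, W))
w_gate = rng.standard_normal((C, 2 * C)) * 0.1
b_gate = np.zeros(C)

fused = gated_fusion(cnn_feat, vit_feat, w_gate, b_gate)
```

Because the gate lies strictly between 0 and 1, each fused value is a convex combination of the two inputs, so the gate can lean on whichever branch is more informative at a given location. In a pyramid setting the same operation would be repeated at every scale, with separate gate parameters per level.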