Semantic Change Detection (SCD) is the task of simultaneously extracting the changed areas and their semantic categories (before and after the changes) in Remote Sensing Images (RSIs). It is more informative than Binary Change Detection (BCD) because it enables detailed change analysis of the observed areas. Previous works established triple-branch Convolutional Neural Network (CNN) architectures as the paradigm for SCD. However, it remains challenging to exploit semantic information when only a limited number of change samples is available. In this work, we investigate jointly modeling the spatio-temporal dependencies to improve the accuracy of SCD. First, we propose a Semantic Change Transformer (SCanFormer) to explicitly model the 'from-to' semantic transitions between the bi-temporal RSIs. Then, we introduce a semantic learning scheme that leverages spatio-temporal constraints consistent with the SCD task to guide the learning of semantic changes. The resulting network (SCanNet) significantly outperforms the baseline method in both the detection of critical semantic changes and the semantic consistency of the obtained bi-temporal results. It achieves state-of-the-art (SOTA) accuracy on two SCD benchmark datasets.
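To make the task output concrete, the following minimal sketch shows how 'from-to' transitions can be read off a pair of bi-temporal semantic maps and a binary change mask. It illustrates the SCD output format only and is not the SCanNet implementation. The function names, the toy class labels, and the simple per-pixel consistency check are assumptions introduced for illustration.

```python
from collections import Counter


def from_to_transitions(sem_t1, sem_t2, change_mask):
    """Count (class_before, class_after) pairs over pixels flagged as changed."""
    counts = Counter()
    for row1, row2, row_c in zip(sem_t1, sem_t2, change_mask):
        for before, after, changed in zip(row1, row2, row_c):
            if changed:
                counts[(before, after)] += 1
    return counts


def inconsistent_pixels(sem_t1, sem_t2, change_mask):
    """Count pixels where the change mask disagrees with the semantic maps.

    Assumed definition: a pixel is inconsistent if it is flagged as changed
    but has the same class at both dates, or is flagged as unchanged but
    has different classes.
    """
    n = 0
    for row1, row2, row_c in zip(sem_t1, sem_t2, change_mask):
        for before, after, changed in zip(row1, row2, row_c):
            if bool(changed) != (before != after):
                n += 1
    return n


# Toy 2x3 maps with hypothetical labels: 0 = water, 1 = vegetation, 2 = building.
sem_t1 = [[0, 1, 1],
          [1, 1, 2]]
sem_t2 = [[0, 2, 1],
          [2, 1, 2]]
change = [[0, 1, 0],
          [1, 0, 1]]

transitions = from_to_transitions(sem_t1, sem_t2, change)
n_inconsistent = inconsistent_pixels(sem_t1, sem_t2, change)
print(dict(transitions))
print(n_inconsistent)
```

In this toy example the last pixel is flagged as changed although both dates predict the same class. That is the kind of disagreement between the change branch and the semantic branches that a semantic consistency constraint is meant to discourage.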