Morphological methods play a crucial role in remote sensing image processing because they capture and preserve small structural details. However, most existing deep learning models for semantic segmentation, including U-Net and the Segment Anything Model (SAM), are based on an encoder-decoder architecture whose downsampling stages tend to discard fine details. In this paper, we propose a new approach that integrates a learnable morphological skeleton prior into deep neural networks using a variational method. Classical morphological operations are non-differentiable, which makes backpropagation through them difficult. To address this, we provide a smooth representation of the morphological skeleton and, using operator splitting and dual methods, design a variational segmentation model that incorporates the morphological skeleton prior. We then integrate this model into the SAM network architecture by adding a token to the mask decoder and modifying the final sigmoid layer, so that the final segmentation results preserve the skeleton structure as much as possible. Experimental results on remote sensing datasets of buildings and roads demonstrate that our method outperforms the original SAM on slender object segmentation and exhibits better generalization capability.
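To make the idea of a smooth skeleton representation concrete, the following is a minimal sketch of one generic relaxation, not the variational formulation proposed in the paper. It assumes the classical skeleton formula (the union, over successive erosions, of each eroded set minus its opening) and replaces the hard min and max of erosion and dilation with a log-mean-exp approximation controlled by a temperature `t`. The function names, the 3x3 structuring element, the border handling, and the default parameters are all illustrative assumptions.

```python
import math


def smooth_extreme(values, t, sign):
    """Log-mean-exp relaxation: sign=+1 approximates max, sign=-1 approximates min."""
    m = max(sign * v for v in values)  # shift for numerical stability
    mean_exp = sum(math.exp((sign * v - m) / t) for v in values) / len(values)
    return sign * (m + t * math.log(mean_exp))


def _morph(img, t, sign):
    """Apply a smooth 3x3 min (erosion) or max (dilation) filter.

    Assumption: pixels outside the image are simply ignored at the border.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [img[a][b]
                      for a in range(max(0, i - 1), min(h, i + 2))
                      for b in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = smooth_extreme(window, t, sign)
    return out


def soft_erode(img, t=0.05):
    return _morph(img, t, -1)


def soft_dilate(img, t=0.05):
    return _morph(img, t, +1)


def soft_open(img, t=0.05):
    return soft_dilate(soft_erode(img, t), t)


def soft_skeleton(img, n_iter=3, t=0.05):
    """Relaxed skeleton: accumulate (eroded image minus its opening) over scales."""
    h, w = len(img), len(img[0])
    skel = [[0.0] * w for _ in range(h)]
    cur = img
    for _ in range(n_iter + 1):
        opened = soft_open(cur, t)
        for i in range(h):
            for j in range(w):
                # Clamp at zero (a ReLU, differentiable almost everywhere).
                residue = max(cur[i][j] - opened[i][j], 0.0)
                # Soft union s + r - s*r keeps values inside [0, 1].
                skel[i][j] = skel[i][j] + residue - skel[i][j] * residue
        cur = soft_erode(cur, t)
    return skel


# Usage: a horizontal bar three pixels thick; the response concentrates on its centre row.
bar = [[1.0 if 2 <= r <= 4 else 0.0 for _ in range(9)] for r in range(7)]
skeleton = soft_skeleton(bar)
print([round(row[4], 2) for row in skeleton])
```

Every operation here is a composition of exponentials, logarithms, sums, and a clamp, so the same construction written with an automatic differentiation library would admit gradients with respect to the input probabilities. As `t` decreases, the relaxed operators approach the hard min and max, at the cost of sharper gradients.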