Recently, deep learning-based salient object detection (SOD) in optical remote sensing images (ORSIs) has achieved significant breakthroughs. We observe that existing ORSIs-SOD methods consistently center on optimizing pixel features in the spatial domain, progressively distinguishing objects from backgrounds. However, pixel information represents local attributes, which are often correlated with their surrounding context. Even with strategies that expand the local region, spatial features remain biased towards local characteristics and lack global perception. To address this problem, we introduce the Fourier transform, which generates global frequency features and achieves an image-size receptive field. Specifically, we propose a novel United Domain Cognition Network (UDCNet) to jointly explore global-local information in the frequency and spatial domains. Technically, we first design a frequency-spatial domain transformer block that mutually amalgamates the complementary local spatial and global frequency features to strengthen the capability of the initial input features. Furthermore, a dense semantic excavation module is constructed to capture higher-level semantics for guiding the positioning of remote sensing objects. Finally, we devise a dual-branch joint optimization decoder that applies saliency and edge branches to generate high-quality representations for predicting salient objects. Experimental results demonstrate the superiority of the proposed UDCNet over 24 state-of-the-art models through extensive quantitative and qualitative comparisons on three widely used ORSIs-SOD datasets. The source code is available at: \href{https://github.com/CSYSI/UDCNet}{\color{blue} https://github.com/CSYSI/UDCNet}.
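The claim that frequency features give an image-size receptive field can be illustrated with a small NumPy sketch. This is not the UDCNet implementation; the function names `frequency_filter` and `local_filter`, the 16x16 map size, and the random frequency weights are all assumptions made purely for illustration. The sketch feeds a single-pixel impulse through a 3x3 spatial filter and through a pointwise reweighting in the Fourier domain, then counts how many output pixels respond.

```python
import numpy as np


def frequency_filter(feature, weight):
    """Reweight each frequency component of a real 2-D map (hypothetical helper)."""
    spectrum = np.fft.rfft2(feature)                 # shape (H, W // 2 + 1), complex
    return np.fft.irfft2(spectrum * weight, s=feature.shape)


def local_filter(feature):
    """3x3 mean filter with zero padding: a purely local spatial operation."""
    h, w = feature.shape
    padded = np.pad(feature, 1)                      # zero padding of width 1
    out = np.zeros_like(feature)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0


h = w = 16                                           # assumed toy resolution
impulse = np.zeros((h, w))
impulse[8, 8] = 1.0                                  # a single active pixel

rng = np.random.default_rng(0)
# Assumed stand-in for learnable frequency weights: one real scale per frequency bin.
weight = rng.uniform(0.5, 1.5, size=(h, w // 2 + 1))

local_response = local_filter(impulse)
freq_response = frequency_filter(impulse, weight)

# Number of output pixels influenced by the single input pixel.
local_support = int(np.count_nonzero(local_response))
freq_support = int(np.count_nonzero(np.abs(freq_response) > 1e-6))

print("pixels touched by the 3x3 spatial filter:", local_support)
print("pixels touched by the frequency filter:  ", freq_support)
```

The spatial filter spreads the impulse only over its 3x3 neighborhood, whereas a single pointwise multiplication in the frequency domain spreads it across essentially the whole map, which is the sense in which frequency features are global.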