This research addresses the discovery and localization of hidden objects in the wild, in service of unmanned systems. Our empirical analysis shows that infrared and visible image fusion (IVIF) makes hard-to-find objects apparent, whereas multimodal salient object detection (SOD) delineates the precise spatial location of objects within the image. Both tasks seek complementary cues from different source images, which motivates us to explore, for the first time, the collaborative relationship between Fusion and Salient object detection on infrared and visible images via an Interactively Reinforced multi-task paradigm, termed IRFS. To seamlessly bridge multimodal image fusion and SOD, we develop a Feature Screening-based Fusion subnetwork (FSFNet) that screens out interfering features from the source images, thereby preserving saliency-related features. The fused image generated by FSFNet is then fed into the subsequent Fusion-Guided Cross-Complementary SOD subnetwork (FC$^2$Net) as a third modality, whose complementary information drives precise prediction of the saliency map. In addition, we develop an interactive loop learning strategy that achieves mutual reinforcement of the IVIF and SOD tasks with a shorter training period and fewer network parameters. Comprehensive experimental results demonstrate that seamlessly bridging IVIF and SOD enhances the performance of both tasks and highlight the superiority of the proposed paradigm.
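The data flow described above (fuse, pass the fused image to the SOD stage as a third modality, then loop) can be sketched in a few lines. This is only a toy illustration under stated assumptions: `fuse`, `predict_saliency`, and `interactive_loop` are hypothetical stand-ins, not FSFNet or FC$^2$Net, and the feedback rule (weighting infrared more heavily in salient regions) is an assumption, since the abstract states only that the two tasks reinforce each other.

```python
# Toy data-flow sketch. All functions are hypothetical placeholders,
# not the paper's FSFNet / FC^2Net. Images are lists of rows of floats in [0, 1].

def fuse(ir, vis, weight):
    """Stand-in for the fusion stage: per-pixel weighted blend of IR and visible."""
    return [[w * a + (1.0 - w) * b for a, b, w in zip(r_ir, r_vis, r_w)]
            for r_ir, r_vis, r_w in zip(ir, vis, weight)]

def predict_saliency(ir, vis, fused, threshold=0.5):
    """Stand-in for the SOD stage: average the three modalities, then threshold."""
    return [[1.0 if (a + b + f) / 3.0 > threshold else 0.0
             for a, b, f in zip(r_ir, r_vis, r_f)]
            for r_ir, r_vis, r_f in zip(ir, vis, fused)]

def interactive_loop(ir, vis, rounds=3):
    """Alternate the two stages so each round's saliency steers the next fusion."""
    height, width = len(ir), len(ir[0])
    weight = [[0.5] * width for _ in range(height)]  # start with an even blend
    fused, saliency = None, None
    for _ in range(rounds):
        fused = fuse(ir, vis, weight)                # fusion stage
        saliency = predict_saliency(ir, vis, fused)  # fused image as third input
        # Assumed feedback rule: favor infrared where the region is salient.
        weight = [[0.8 if s > 0 else 0.5 for s in row] for row in saliency]
    return fused, saliency

if __name__ == "__main__":
    ir = [[0.9, 0.1], [0.2, 0.8]]
    vis = [[0.7, 0.1], [0.1, 0.3]]
    fused, saliency = interactive_loop(ir, vis)
    print(saliency)
```

In a real system both stages would be learned networks trained jointly; the sketch only shows the order in which the fused image and the saliency map are exchanged.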