Camouflaged object detection (COD), which aims to segment objects whose patterns closely resemble the background, is a challenging task. Most existing works design specialized modules to recover camouflaged objects with complete, fine details, yet they localize object boundaries poorly because they lack object-related semantics. In this paper, we propose a novel ``pre-train, adapt and detect'' paradigm for detecting camouflaged objects. By introducing a large pre-trained model, the abundant knowledge learned from massive multi-modal data can be transferred directly to COD. A lightweight parallel adapter is inserted to adapt the pre-trained features to the downstream COD task. Extensive experiments on four challenging benchmark datasets demonstrate that our method outperforms existing state-of-the-art COD models by large margins. Moreover, we design a multi-task learning scheme for tuning the adapter that exploits the knowledge shareable across different semantic classes. Comprehensive experimental results show that the generalization ability of our model is substantially improved by multi-task adapter initialization on source tasks and multi-task adaptation on target tasks.
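To make the idea of a parallel adapter concrete, the following is a minimal sketch in plain Python, not the architecture used in this paper. It assumes a generic bottleneck adapter (down-projection, ReLU, up-projection) whose output is added to that of a frozen layer fed the same input; the names `ParallelAdapter`, `adapted_layer`, `bottleneck` and `scale`, as well as the zero initialization of the up-projection, are illustrative assumptions.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix standing in for learned weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ParallelAdapter:
    """Hypothetical bottleneck adapter that runs alongside a frozen layer."""

    def __init__(self, dim, bottleneck, rng, scale=1.0):
        # Trainable down-projection: dim -> bottleneck.
        self.down = rand_matrix(bottleneck, dim, rng)
        # Trainable up-projection: bottleneck -> dim, zero-initialized
        # (an assumption) so the adapter starts as a no-op.
        self.up = [[0.0] * bottleneck for _ in range(dim)]
        self.scale = scale

    def num_params(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)

    def __call__(self, x):
        hidden = [max(0.0, h) for h in matvec(self.down, x)]  # ReLU
        return [self.scale * u for u in matvec(self.up, hidden)]


def adapted_layer(frozen_W, adapter, x):
    """Frozen branch plus adapter branch, both reading the same input x."""
    frozen_out = matvec(frozen_W, x)  # pre-trained weights, never updated
    return [f + a for f, a in zip(frozen_out, adapter(x))]


rng = random.Random(0)
dim, bottleneck = 8, 2
frozen_W = rand_matrix(dim, dim, rng)  # stand-in for one pre-trained layer
adapter = ParallelAdapter(dim, bottleneck, rng)
x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
y0 = adapted_layer(frozen_W, adapter, x)
```

In this sketch only `adapter.down` and `adapter.up` would be trained, which is 2 x 8 x 2 = 32 values against the 64 frozen ones; with the zero-initialized up-projection, the adapted layer initially reproduces the frozen layer's output exactly.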