In this paper, we consider the problem of referring camouflaged object detection (Ref-COD), a new task that aims to segment specified camouflaged objects based on some form of reference, such as an image or text. We first assemble a large-scale dataset, called R2C7K, which consists of 7K images covering 64 object categories in real-world scenarios. We then develop a simple but strong dual-branch framework, dubbed R2CNet, in which a reference branch learns common representations from the referring information and a segmentation branch identifies and segments camouflaged objects under the guidance of those common representations. In particular, we design a Referring Mask Generation module to generate a pixel-level prior mask and a Referring Feature Enrichment module to enhance the capability of identifying camouflaged objects. Extensive experiments show the superiority of our Ref-COD methods over their COD counterparts in segmenting specified camouflaged objects and identifying the main body of target objects. Our code and dataset are publicly available at https://github.com/zhangxuying1004/RefCOD.
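To make the dual-branch idea concrete, the following is a minimal NumPy sketch of how a reference representation could guide segmentation features. It is not the R2CNet implementation: the function names (`common_representation`, `prior_mask`, `enrich`), the use of mean pooling for the common representation, cosine similarity for the prior mask, and residual reweighting for enrichment are all illustrative assumptions, since the abstract does not specify how the modules are built.

```python
import numpy as np


def common_representation(ref_feats):
    """Assumed reference branch: average K reference embeddings (K, C) into one (C,) vector."""
    return np.asarray(ref_feats, dtype=float).mean(axis=0)


def prior_mask(feat_map, ref_vec, eps=1e-8):
    """Assumed mask generation: per-pixel cosine similarity, rescaled to [0, 1].

    feat_map has shape (H, W, C); ref_vec has shape (C,).
    """
    feat_map = np.asarray(feat_map, dtype=float)
    dots = feat_map @ ref_vec  # (H, W) dot products with the reference vector
    norms = np.linalg.norm(feat_map, axis=-1) * np.linalg.norm(ref_vec)
    cos = dots / (norms + eps)  # cosine similarity in [-1, 1]
    return (cos + 1.0) / 2.0


def enrich(feat_map, mask):
    """Assumed feature enrichment: reweight features by the mask, with a residual path."""
    feat_map = np.asarray(feat_map, dtype=float)
    return feat_map * (1.0 + mask[..., None])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    refs = rng.normal(size=(5, 8))      # 5 hypothetical reference embeddings
    feats = rng.normal(size=(4, 4, 8))  # hypothetical 4x4 feature map with 8 channels
    vec = common_representation(refs)
    mask = prior_mask(feats, vec)
    print(mask.shape, enrich(feats, mask).shape)
```

In this sketch, pixels whose features align with the reference vector receive mask values near 1 and are amplified, while the residual term keeps the original features available to later stages.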