Camouflaged object detection (COD) is a challenging task that aims to identify objects that are highly similar to their background. Motivated by the strong noise-to-image denoising capability of denoising diffusion models, we propose diffCOD, a diffusion-based framework that treats camouflaged object segmentation as a denoising diffusion process from noisy masks to object masks. Specifically, the object mask is diffused from the ground-truth mask to a random distribution, and the designed model learns to reverse this noising process. To strengthen denoising learning, the input image is encoded as a prior and integrated into the denoising diffusion model to guide the diffusion process. Furthermore, we design an injection attention module (IAM) in which the conditional semantic features extracted from the image interact with the diffusion noise embedding through a cross-attention mechanism. Extensive experiments on four widely used COD benchmark datasets demonstrate that the proposed method achieves favorable performance compared with 11 existing state-of-the-art methods, especially in segmenting the detailed textures of camouflaged objects. Our code will be made publicly available at: https://github.com/ZNan-Chen/diffCOD.
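For illustration only, the two mechanisms named above can be sketched in a few lines of NumPy: the forward process that diffuses a ground-truth mask toward Gaussian noise, and a cross-attention step between noise-embedding tokens and image features. This is a minimal sketch under assumptions that the abstract does not state: a standard DDPM-style closed-form noising step with a linear beta schedule, masks rescaled to [-1, 1], and a single-head attention without learned projections in which the noise embedding supplies the queries. The function names `linear_beta_schedule`, `forward_noise_mask`, and `cross_attention` are hypothetical and are not taken from the diffCOD code.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Assumed schedule: linearly spaced noise variances, as in common DDPM setups.
    return np.linspace(beta_start, beta_end, num_steps)


def forward_noise_mask(mask, t, alphas_cumprod, rng):
    """Sample a noisy mask x_t from q(x_t | x_0) in closed form.

    mask: binary array (the ground-truth mask x_0), values in {0, 1}.
    t: integer timestep index into alphas_cumprod.
    Returns the noisy mask and the Gaussian noise that was added.
    """
    x0 = mask.astype(np.float64) * 2.0 - 1.0  # rescale {0, 1} to {-1, 1} (assumption)
    noise = rng.standard_normal(x0.shape)
    a_bar = alphas_cumprod[t]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise
    return x_t, noise


def cross_attention(queries, context):
    """Single-head scaled dot-product cross-attention (no learned projections).

    queries: (n_q, d) tokens, here standing in for the diffusion noise embedding.
    context: (n_c, d) tokens, here standing in for image semantic features.
    Returns the attended features (n_q, d) and the attention weights (n_q, n_c).
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ context, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    betas = linear_beta_schedule(1000)
    alphas_cumprod = np.cumprod(1.0 - betas)

    # Toy 8x8 ground-truth mask with a square "object" in the middle.
    mask = np.zeros((8, 8))
    mask[2:6, 2:6] = 1.0

    # Early timesteps stay close to the mask; late ones approach pure noise.
    x_early, _ = forward_noise_mask(mask, 10, alphas_cumprod, rng)
    x_late, _ = forward_noise_mask(mask, 999, alphas_cumprod, rng)

    # Toy tokens: 4 noise-embedding tokens attend over 10 image-feature tokens.
    noise_tokens = rng.standard_normal((4, 16))
    image_tokens = rng.standard_normal((10, 16))
    fused, attn = cross_attention(noise_tokens, image_tokens)
    print(x_early.shape, x_late.shape, fused.shape, attn.shape)
```

In a trained model, a denoising network conditioned on the image features would be optimized to undo `forward_noise_mask`; the sketch covers only the noising direction and the attention arithmetic, not the network or the training loss.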