Segment Anything Model (SAM) has recently gained much attention for its outstanding generalization to unseen data and tasks. Despite its promising prospects, the vulnerability of SAM, especially to universal adversarial perturbations (UAPs), has not yet been thoroughly investigated. In this paper, we propose DarkSAM, the first prompt-free universal attack framework against SAM, comprising a semantic decoupling-based spatial attack and a texture distortion-based frequency attack. We first divide the output of SAM into foreground and background. We then design a shadow target strategy to obtain the semantic blueprint of the image as the attack target. DarkSAM fools SAM by extracting and destroying crucial object features from images in both the spatial and frequency domains. In the spatial domain, we disrupt the semantics of both the foreground and background of the image to confuse SAM. In the frequency domain, we further enhance the attack by distorting the high-frequency components (i.e., texture information) of the image. Consequently, with a single UAP, DarkSAM renders SAM incapable of segmenting objects across diverse images with varying prompts. Experimental results on four datasets for SAM and two of its variants demonstrate the powerful attack capability and transferability of DarkSAM.
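The frequency-domain step can be illustrated with a minimal sketch. It assumes an FFT-based low-pass split with a circular mask of hypothetical radius `radius`; the paper's exact decomposition and perturbation objective may differ, so this is only an illustration of restricting a perturbation to the high-frequency (texture) band:

```python
import numpy as np

def split_frequency(img, radius=8):
    """Split a grayscale image into low- and high-frequency components
    via a centered 2-D FFT and a circular low-pass mask.
    Illustrative only; not the exact decomposition used by DarkSAM."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(f * mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(f * ~mask)).real
    return low, high

def distort_texture(img, uap, radius=8):
    """Apply only the high-frequency part of a perturbation, so the
    attack targets texture information while leaving the low-frequency
    content of the image untouched (hypothetical formulation)."""
    low, high = split_frequency(img, radius)
    _, uap_high = split_frequency(uap, radius)
    return low + high + uap_high
```

Because the two masks partition the spectrum, `low + high` reconstructs the original image up to floating-point error, so the perturbation budget is spent entirely on the texture band.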