Segment Anything Model (SAM) has recently attracted significant attention due to its impressive zero-shot performance on various downstream tasks. The computer vision (CV) field may follow natural language processing (NLP) in moving from task-specific vision models toward foundation models. However, deep vision models are widely recognized as vulnerable to adversarial examples, which fool the model into making wrong predictions with imperceptible perturbations. This vulnerability raises serious concerns when deep models are applied to security-sensitive applications. Therefore, it is critical to know whether the vision foundation model SAM can also be fooled by adversarial attacks. To the best of our knowledge, our work is the first to conduct a comprehensive investigation of how to attack SAM with adversarial examples. With the basic attack goal set to mask removal, we investigate the adversarial robustness of SAM in the full white-box setting and in transfer-based black-box settings. Beyond the basic goal of mask removal, we further find that it is possible to generate any desired mask through an adversarial attack.
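To make the mask-removal goal concrete, the following is a minimal sketch of a white-box, L-infinity-bounded iterative attack in the style of projected gradient descent. It is not the attack evaluated in this work and it does not touch SAM: the per-pixel linear model, the hinge-style loss, and all names and hyperparameters (`predict_logits`, `pgd_mask_removal`, `eps`, `step`, `iters`) are assumptions chosen so that the example runs with the Python standard library alone.

```python
def predict_logits(image, weights, biases):
    """Toy stand-in for a segmentation model (NOT SAM): one logit per pixel."""
    return [w * x + b for w, x, b in zip(weights, image, biases)]


def mask_from_logits(logits):
    """A pixel belongs to the predicted mask when its logit is positive."""
    return [1 if z > 0 else 0 for z in logits]


def pgd_mask_removal(image, weights, biases, eps=0.1, step=0.02, iters=20):
    """Push every positive logit below zero within an L-infinity ball.

    Assumed loss: sum over pixels of max(logit, 0). For the toy linear
    model, its gradient w.r.t. pixel i is weights[i] while the logit is
    positive and 0 otherwise, so no autograd library is needed.
    """
    adv = list(image)
    for _ in range(iters):
        logits = predict_logits(adv, weights, biases)
        for i, (w, z) in enumerate(zip(weights, logits)):
            grad = w if z > 0 else 0.0
            sign = (grad > 0) - (grad < 0)
            x = adv[i] - step * sign                           # signed descent step
            x = min(max(x, image[i] - eps), image[i] + eps)    # project onto eps-ball
            adv[i] = min(max(x, 0.0), 1.0)                     # keep a valid pixel value
    return adv


# Hypothetical four-pixel "image" whose clean prediction is a full mask.
image = [0.5, 0.6, 0.4, 0.7]
weights = [1.0, -1.0, 2.0, 1.5]
biases = [-0.45, 0.65, -0.75, -1.0]

adversarial = pgd_mask_removal(image, weights, biases)
print(mask_from_logits(predict_logits(image, weights, biases)))
print(mask_from_logits(predict_logits(adversarial, weights, biases)))
```

In the white-box setting the attacker reads the gradient directly from the model, as the sketch does with `weights`. In a transfer-based black-box setting the perturbation would instead be computed on a surrogate model and then applied to the target.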