The Segment Anything Model (SAM) has recently attracted significant attention due to its impressive performance on various downstream tasks in a zero-shot manner. The computer vision (CV) field may follow natural language processing (NLP) in moving from task-specific vision models toward foundation models. However, task-specific models are widely recognized as vulnerable to adversarial examples, which fool the model into making wrong predictions through imperceptible perturbations. This vulnerability to adversarial attacks raises serious concerns when deep models are applied in security-sensitive applications. It is therefore critical to know whether the vision foundation model SAM can also be easily fooled by adversarial attacks. To the best of our knowledge, our work is the first to conduct a comprehensive investigation of how to attack SAM with adversarial examples. Specifically, we find that SAM is vulnerable to white-box attacks while remaining robust to some extent in the black-box setting. This is an ongoing project, and further results and findings will be posted at https://github.com/chenshuang-zhang/attack-sam.
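For readers unfamiliar with the white-box setting, the following is a minimal sketch of a one-step gradient-sign attack (FGSM) on a toy logistic-regression classifier. It is an illustration only and is not the attack applied to SAM in this work: the model, the helper name `fgsm_perturb`, the parameters `w` and `b`, and the perturbation budget `eps` are all assumptions made for the example. "White-box" here means the attacker can read the model parameters and therefore compute the gradient of the loss with respect to the input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """One-step FGSM on a toy logistic-regression model (hypothetical helper).

    White-box assumption: the attacker knows w and b.
    """
    p = sigmoid(x @ w + b)
    # For binary cross-entropy, the gradient of the loss w.r.t. the input is (p - y) * w.
    grad_x = (p - y) * w
    # Move each coordinate by eps in the direction that increases the loss,
    # so the perturbation is bounded by eps in the L-infinity norm.
    return x + eps * np.sign(grad_x)

# Toy setup (all values are illustrative assumptions).
rng = np.random.default_rng(0)
w = rng.normal(size=16)
b = 0.0
x = 0.05 * np.sign(w)   # clean input, classified as positive with a small margin
y = 1.0                 # true label

x_adv = fgsm_perturb(x, y, w, b, eps=0.1)

clean_score = float(x @ w + b)      # positive logit: predicted class 1
adv_score = float(x_adv @ w + b)    # negative logit: prediction flipped
print(clean_score > 0, adv_score < 0, np.max(np.abs(x_adv - x)))
```

In this toy case a perturbation of at most `eps` per coordinate is enough to flip the prediction. A black-box attacker would not have access to `w` and `b` and would have to rely on a surrogate model or on queries instead, which is one intuition for why the two settings can differ in difficulty.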