Robustness of SAM: Segment Anything Under Corruptions and Beyond

From arXiv. The first work to evaluate the robustness of SAM under various corruptions such as style transfer, local occlusion, and adversarial patch attacks.

The Segment Anything Model (SAM), as its name suggests, is claimed to be capable of cutting out any object, and it demonstrates impressive zero-shot transfer performance when guided by a prompt. However, a comprehensive evaluation of its robustness under various corruptions is currently lacking, even though understanding SAM's robustness across different corruption scenarios is crucial for its real-world deployment. Prior works show that SAM is biased towards texture (style) rather than shape. Motivated by this, we start by investigating SAM's robustness against style transfer, which is a synthetic corruption. Following the interpretation of a corruption's effect as a style change, we then conduct a comprehensive evaluation of SAM's robustness against 15 types of common corruption. These corruptions mainly fall into categories such as digital, noise, weather, and blur. Within each of these corruption categories, we explore 5 severity levels to simulate real-world corruption scenarios. Beyond these corruptions, we further assess SAM's robustness to local occlusion and local adversarial patch attacks in images. To the best of our knowledge, our work is the first to evaluate the robustness of SAM under style change, local occlusion, and local adversarial patch attacks. Because patch attacks visible to the human eye are easily detected, we also assess SAM's robustness against adversarial perturbations that are imperceptible to the human eye. Overall, this work provides a comprehensive empirical study of SAM's robustness: it evaluates performance under various corruptions and extends the assessment to local occlusion, local patch attacks, and imperceptible adversarial perturbations, yielding insights into SAM's practical applicability and effectiveness in real-world settings.
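To make the evaluation protocol described in the abstract more concrete, here is a minimal sketch of a severity sweep and a local occlusion test. It is not the paper's code. The noise levels in `SEVERITY_SIGMA`, the toy image, the function names, and the `threshold_segmenter` stand-in are all assumptions made for illustration. A real study would call SAM with a prompt instead of the stand-in and would likely score against ground-truth masks.

```python
import random

# Hypothetical noise standard deviations for severities 1..5 (assumed values,
# not taken from the paper).
SEVERITY_SIGMA = [0.04, 0.08, 0.12, 0.18, 0.26]


def gaussian_noise(image, severity, rng):
    """Add zero-mean Gaussian noise to a grayscale image with values in [0, 1]."""
    sigma = SEVERITY_SIGMA[severity - 1]
    return [[min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
            for row in image]


def occlude(image, top, left, size, fill=0.0):
    """Local occlusion: overwrite a size x size square with a constant value."""
    out = [row[:] for row in image]
    for r in range(top, min(top + size, len(out))):
        for c in range(left, min(left + size, len(out[r]))):
            out[r][c] = fill
    return out


def iou(mask_a, mask_b):
    """Intersection over union of two boolean masks of equal shape."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += a and b
            union += a or b
    return inter / union if union else 1.0


def threshold_segmenter(image):
    """Stand-in for a real model: foreground is any pixel brighter than 0.5."""
    return [[px > 0.5 for px in row] for row in image]


def robustness_curve(image, reference_mask, segment, seed=0):
    """IoU against the reference mask at each of the 5 severity levels."""
    rng = random.Random(seed)
    return [iou(reference_mask, segment(gaussian_noise(image, s, rng)))
            for s in range(1, 6)]


def make_demo_image(size=16):
    """Toy image: a bright 8 x 8 square (0.8) on a dark background (0.2)."""
    return [[0.8 if 4 <= r < 12 and 4 <= c < 12 else 0.2 for c in range(size)]
            for r in range(size)]


image = make_demo_image()
reference = threshold_segmenter(image)
print("IoU per severity:", robustness_curve(image, reference, threshold_segmenter))
print("IoU under occlusion:",
      iou(reference, threshold_segmenter(occlude(image, 4, 4, 8))))
```

The same loop structure extends to other corruption types by swapping `gaussian_noise` for another function that takes a severity level. Only the per-severity parameters change.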
