While a large body of work has focused on designing adversarial attacks against image classifiers, only a few methods exist for attacking semantic segmentation models. We show that attacking segmentation models presents task-specific challenges, for which we propose novel solutions. Our final evaluation protocol outperforms existing methods and shows that those methods can overestimate the robustness of the models. Additionally, adversarial training, the most successful way to obtain robust image classifiers, has so far not been applied successfully to semantic segmentation. We argue that this is because the task to be learned is more challenging and requires significantly more computational effort than image classification. As a remedy, we show that, by taking advantage of recent advances in robust ImageNet classifiers, one can train adversarially robust segmentation models at limited computational cost by fine-tuning robust backbones.