Existing segmentation models are highly vulnerable to adversarial attacks. To improve robustness, adversarial training incorporates adversarial examples into model training. However, existing attack methods consider only global semantic information and ignore the contextual semantic relationships within each sample, which limits the effectiveness of adversarial training. To address this issue, we propose EroSeg-AT, a vulnerability-aware adversarial training framework that uses EroSeg to generate adversarial examples. EroSeg first selects sensitive pixels based on pixel-level confidence and then progressively propagates perturbations to higher-confidence pixels, disrupting the semantic consistency of the sample. Experimental results show that, compared with existing methods, our approach significantly improves attack effectiveness and enhances model robustness under adversarial training.
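The two-stage idea in the abstract (start from low-confidence pixels, then extend the perturbation to higher-confidence ones) can be illustrated with a minimal NumPy sketch. This is not the EroSeg implementation: the abstract does not specify the selection rule, the propagation schedule, or the attack loss, so the quantile-based schedule, the signed-gradient update, and the names `pixel_confidence`, `progressive_masks`, `masked_sign_step`, `n_stages`, and `init_frac` are all assumptions made for illustration.

```python
import numpy as np

def pixel_confidence(probs):
    """Per-pixel confidence: the highest class probability at each pixel.
    probs has shape (H, W, C) and sums to 1 along the last axis."""
    return probs.max(axis=-1)

def progressive_masks(confidence, n_stages=4, init_frac=0.25):
    """Hypothetical schedule: return n_stages boolean masks of shape (H, W).
    Stage 0 covers roughly the init_frac least-confident pixels; each later
    stage raises the confidence cutoff, so the mask grows until the final
    stage covers every pixel."""
    fracs = np.linspace(init_frac, 1.0, n_stages)
    return [confidence <= np.quantile(confidence, f) for f in fracs]

def masked_sign_step(x, grad, mask, step_size, x_clean, eps):
    """One signed-gradient step restricted to masked pixels (an assumed
    update rule), projected back into the L-infinity ball of radius eps
    around x_clean and then into the valid image range [0, 1]."""
    x_new = x + step_size * np.sign(grad) * mask[..., None]
    x_new = np.clip(x_new, x_clean - eps, x_clean + eps)
    return np.clip(x_new, 0.0, 1.0)

# Toy run with random data; a real attack would take probs and grad
# from a segmentation model and its loss.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 8, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
masks = progressive_masks(pixel_confidence(probs))

x_clean = rng.uniform(size=(8, 8, 3))
x_adv = x_clean.copy()
for mask in masks:
    grad = rng.normal(size=x_clean.shape)  # placeholder for a loss gradient
    x_adv = masked_sign_step(x_adv, grad, mask, 2 / 255, x_clean, 8 / 255)
```

In an adversarial training loop, samples such as `x_adv` would be fed back to the model alongside or instead of the clean inputs. Recomputing the confidence map after every stage, rather than once up front as in this sketch, is another plausible design that the abstract leaves open.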