Sharpness-Aware Minimization (SAM) was introduced to improve generalization by seeking flat minima, yet it also exhibits robustness to label noise, a phenomenon that remains only partially understood. Prior work has mainly attributed this effect to SAM's tendency to prolong the learning of clean samples. In this work, we provide a complementary explanation by analyzing SAM at the element-wise level. We show that when noisy gradients dominate a parameter direction, their influence is reduced by the stronger amplification of clean gradients. This slows the memorization of noisy labels while sustaining clean learning, offering a more complete account of SAM's robustness. Building on this insight, we propose SANER (Sharpness-Aware Noise-Explicit Reweighting), a simple variant of SAM that explicitly magnifies this down-weighting effect. Experiments on benchmark image classification tasks with noisy labels demonstrate that SANER significantly mitigates noisy-label memorization and improves generalization over both SAM and SGD. Moreover, since SANER is designed from the mechanism of SAM, it can also be seamlessly integrated into SAM-like variants, further boosting their robustness.
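To make the element-wise view concrete, the following is a minimal sketch of the standard SAM gradient computation on a toy least-squares problem, together with a per-coordinate comparison against the plain gradient. It is not the SANER algorithm, whose update rule is not specified in this abstract. The function names `grad`, `sam_grad`, and `elementwise_ratio`, the toy loss, and the value `rho=0.05` are illustrative assumptions rather than anything taken from the paper.

```python
import math


def grad(w, X, y):
    """Gradient of the loss 0.5 * mean((x.w - y)^2) with respect to w."""
    n = len(X)
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
        for j, xj in enumerate(xi):
            g[j] += err * xj / n
    return g


def sam_grad(w, X, y, rho=0.05):
    """Standard SAM gradient: the gradient evaluated at w + rho * g / ||g||_2."""
    g = grad(w, X, y)
    norm = math.sqrt(sum(gj * gj for gj in g))
    if norm == 0.0:
        return g  # no ascent direction at a stationary point
    w_adv = [wj + rho * gj / norm for wj, gj in zip(w, g)]
    return grad(w_adv, X, y)


def elementwise_ratio(g_sam, g_sgd, eps=1e-12):
    """Per-coordinate ratio of the SAM gradient to the plain gradient.

    Hypothetical diagnostic: coordinates with a near-zero plain gradient
    are reported as NaN instead of dividing by zero.
    """
    return [a / b if abs(b) > eps else float("nan")
            for a, b in zip(g_sam, g_sgd)]


# Toy data (assumed for illustration only).
X = [[1.0, 0.0], [0.0, 1.0]]
y = [0.0, 0.0]
w = [1.0, 2.0]

g_sgd = grad(w, X, y)
g_sam = sam_grad(w, X, y, rho=0.05)
ratios = elementwise_ratio(g_sam, g_sgd)
```

On this isotropic quadratic every coordinate is scaled by the same factor, so the ratios are uniform. In a real network, computing the same ratio separately for gradients from clean and from noisy samples would be one way to inspect the kind of per-parameter reweighting the abstract describes.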