Sharpness-aware minimization (SAM) has well-documented merits in enhancing the generalization of deep neural networks, even without sizable data augmentation. Embracing the geometry of the loss function, where neighborhoods of 'flat minima' heighten generalization ability, SAM seeks 'flat valleys' by minimizing the maximum loss caused by an adversary perturbing the parameters within a neighborhood. Although accounting for the sharpness of the loss function is critical, such an 'over-friendly adversary' can curtail the utmost level of generalization. The novel approach of this contribution stabilizes the adversary through variance suppression (VaSSO) to avoid such friendliness. VaSSO's provable stability safeguards its numerical improvement over SAM in model-agnostic tasks, including image classification and machine translation. In addition, experiments confirm that VaSSO endows SAM with robustness against high levels of label noise.
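The min-max procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The SAM step uses the common first-order approximation of the inner maximization (a normalized gradient-ascent step of radius `rho`). The smoothed variant is an assumption: the abstract does not say how variance suppression is realized, so it is modeled here as an exponential moving average of gradients with a hypothetical weight `theta`. The function names, the toy quadratic loss, and all hyperparameter values are likewise illustrative.

```python
import numpy as np

def perturbation(direction, rho):
    """Scale `direction` to (approximately) norm rho: the adversary's move."""
    return rho * direction / (np.linalg.norm(direction) + 1e-12)

def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One SAM update with a first-order approximation of the inner max."""
    eps = perturbation(grad_fn(w), rho)   # adversary perturbs the parameters
    return w - lr * grad_fn(w + eps)      # descend using the perturbed gradient

def smoothed_sam_step(w, d, grad_fn, lr=0.05, rho=0.05, theta=0.2):
    """SAM update whose adversary follows a smoothed gradient direction.

    ASSUMPTION: the exponential moving average below is one plausible way to
    suppress the variance of the adversary; it is not taken from the abstract.
    """
    d = (1.0 - theta) * d + theta * grad_fn(w)  # running average of gradients
    eps = perturbation(d, rho)
    return w - lr * grad_fn(w + eps), d

# Toy quadratic loss with one sharp and one flat direction (illustrative only).
A = np.diag([10.0, 0.1])

def loss(w):
    return 0.5 * w @ A @ w

def grad(w):
    return A @ w

w0 = np.array([1.0, 1.0])

w_sam = w0.copy()
for _ in range(200):
    w_sam = sam_step(w_sam, grad)

w_smooth, d = w0.copy(), np.zeros_like(w0)
for _ in range(200):
    w_smooth, d = smoothed_sam_step(w_smooth, d, grad)

print(loss(w0), loss(w_sam), loss(w_smooth))
```

With a deterministic gradient, as in this toy problem, the two variants behave similarly. Smoothing is meant to matter when `grad_fn` returns noisy minibatch gradients, which this sketch does not simulate.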