Energy-efficient deep neural network (DNN) accelerators are prone to non-idealities that degrade DNN performance at inference time. To mitigate such degradation, existing methods typically add perturbations to the DNN weights during training to simulate inference on noisy hardware. However, this often requires knowledge about the target hardware and leads to a trade-off between DNN performance and robustness, decreasing the former to increase the latter. In this work, we show that applying sharpness-aware training, by optimizing for both the loss value and loss sharpness, significantly improves robustness to noisy hardware at inference time without relying on any assumptions about the target hardware. In particular, we propose a new adaptive sharpness-aware method that conditions the worst-case perturbation of a given weight not only on its magnitude but also on the range of the weight distribution. This is achieved by performing sharpness-aware minimization scaled by outlier minimization (SAMSON). Our approach outperforms existing sharpness-aware training methods both in terms of model generalization performance in noiseless regimes and robustness in noisy settings, as measured on several architectures and datasets.
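To make the idea of sharpness-aware training concrete, the following is a minimal sketch of one generic sharpness-aware update on a toy two-parameter loss. It is not the SAMSON method itself: the abstract does not give the formula for the range-based scaling, so only the standard update and a magnitude-conditioned (adaptive) variant are shown. The toy loss, the function names (`loss`, `grad`, `sam_step`), and the hyperparameter values `lr` and `rho` are assumptions chosen purely for illustration.

```python
import math


def loss(w):
    # Hypothetical toy loss: sharp along w[0], flat along w[1].
    return 10.0 * w[0] ** 2 + 0.1 * w[1] ** 2


def grad(w):
    # Analytic gradient of the toy loss above.
    return [20.0 * w[0], 0.2 * w[1]]


def sam_step(w, lr=0.01, rho=0.05, adaptive=False, eps=1e-12):
    """One sharpness-aware update (illustrative sketch, not SAMSON)."""
    g = grad(w)
    # Adaptive variant: condition the perturbation on each weight's magnitude.
    # Non-adaptive variant: every coordinate is scaled by 1.
    scale = [abs(wi) if adaptive else 1.0 for wi in w]
    scaled_g = [s * gi for s, gi in zip(scale, g)]
    norm = math.sqrt(sum(x * x for x in scaled_g)) + eps
    # First-order approximation of the worst-case perturbation of size rho.
    e = [rho * s * x / norm for s, x in zip(scale, scaled_g)]
    w_adv = [wi + ei for wi, ei in zip(w, e)]
    # Descend using the gradient evaluated at the perturbed weights.
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


if __name__ == "__main__":
    w = [1.0, -2.0]
    for _ in range(5):
        w = sam_step(w, adaptive=True)
    print(w, loss(w))
```

The update evaluates the gradient at a perturbed point rather than at the current weights, which is what steers optimization toward flatter regions of the loss. A range-aware method in the spirit of the abstract would presumably change how `scale` is computed, but that detail is left open here.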