Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue. Lowering the supply voltage is an effective strategy for reducing energy consumption. However, aggressively scaling down the supply voltage can degrade accuracy, because it causes random bit flips in the static random access memory (SRAM) where model parameters are stored. To address this challenge, we introduce NeuralFuse, a novel add-on module that handles the accuracy-energy tradeoff in low-voltage regimes by learning input transformations that generate error-resistant data representations. NeuralFuse protects DNN accuracy in both nominal and low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be readily applied to DNNs with limited access, such as non-configurable hardware or remote access to cloud-based APIs. Experimental results demonstrate that, at a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to 24% while recovering accuracy by up to 57%. To the best of our knowledge, this is the first model-agnostic approach (i.e., one that requires no model retraining) to address low-voltage-induced bit errors. The source code is available at https://github.com/IBM/NeuralFuse.
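To make the notion of a bit error rate concrete, the following is a minimal sketch of how random bit flips in stored parameters might be simulated. It is not taken from the NeuralFuse code base. It assumes, purely for illustration, that parameters are stored as 8-bit unsigned codes and that each bit flips independently with the same probability; the function names `flip_bits` and `count_flipped_bits` are hypothetical.

```python
import random


def flip_bits(weights, bit_error_rate, bits=8, seed=0):
    """Return a copy of `weights` in which every stored bit is flipped
    independently with probability `bit_error_rate`.

    Assumption (not from the paper): each weight is an unsigned integer
    code that occupies `bits` bits in memory.
    """
    rng = random.Random(seed)
    corrupted = []
    for w in weights:
        mask = 0
        for b in range(bits):
            # Decide independently for each bit position whether it flips.
            if rng.random() < bit_error_rate:
                mask |= 1 << b
        # XOR with the mask inverts exactly the selected bits.
        corrupted.append(w ^ mask)
    return corrupted


def count_flipped_bits(original, corrupted):
    """Count how many bit positions differ between the two lists."""
    return sum(bin(a ^ b).count("1") for a, b in zip(original, corrupted))


# Illustrative "model": 10,000 random 8-bit weight codes.
_weight_rng = random.Random(1)
weights = [_weight_rng.randrange(256) for _ in range(10_000)]

# Corrupt the weights at a 1% bit error rate, the rate quoted above.
noisy = flip_bits(weights, bit_error_rate=0.01)
observed_rate = count_flipped_bits(weights, noisy) / (len(weights) * 8)
print(f"observed bit error rate: {observed_rate:.4f}")
```

Under these assumptions the observed rate should land close to the requested 1%. A flip in a high-order bit changes the stored value far more than a flip in a low-order bit, which is one intuition for why even a small bit error rate can noticeably hurt accuracy.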