Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains problematically high. An effective strategy for reducing this consumption is to lower the supply voltage, but doing so too aggressively causes random bit flips in the static random access memory (SRAM) where model parameters are stored, degrading accuracy. To address this challenge, we developed NeuralFuse, a novel add-on module that navigates the energy-accuracy tradeoff in low-voltage regimes by learning input transformations and using them to generate error-resistant data representations, thereby protecting DNN accuracy in both nominal- and low-voltage scenarios. Besides being easy to implement, NeuralFuse can be readily applied to DNNs with limited access, such as remotely accessed cloud-based APIs or non-configurable hardware. Our experimental results demonstrate that, at a 1% bit-error rate, NeuralFuse can reduce SRAM access energy by up to 24% while recovering accuracy by up to 57%. To the best of our knowledge, this is the first approach to addressing low-voltage-induced bit errors that requires no model retraining.
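To make the mechanism concrete, the sketch below illustrates the idea under stated assumptions: a small trainable generator produces a bounded transformation of each input before it reaches a frozen base model, while a helper simulates low-voltage SRAM faults by flipping bits of an 8-bit quantized view of the weights at a given bit-error rate. All names and hyperparameters here (`NeuralFuseGenerator`, `flip_bits`, `mu`, the quantization scheme) are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch (assumed names and hyperparameters, not the authors' code):
# a trainable generator transforms the input before a frozen base model, and
# flip_bits() emulates low-voltage SRAM bit errors on quantized weights.
import torch
import torch.nn as nn

class NeuralFuseGenerator(nn.Module):
    """Tiny convolutional generator producing a bounded input transformation."""
    def __init__(self, channels: int = 3, mu: float = 0.5):
        super().__init__()
        self.mu = mu  # transformation magnitude (assumed hyperparameter)
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, kernel_size=3, padding=1), nn.Tanh(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Add a bounded perturbation, then clamp back to the valid input range.
        return torch.clamp(x + self.mu * self.net(x), -1.0, 1.0)

def flip_bits(w: torch.Tensor, ber: float = 0.01) -> torch.Tensor:
    """Flip each bit of an 8-bit quantized copy of `w` independently with
    probability `ber`, emulating random SRAM bit errors at low voltage."""
    scale = w.abs().max() / 127.0 + 1e-12
    q = torch.clamp((w / scale).round(), -128, 127).to(torch.int16) & 0xFF
    for bit in range(8):  # independent flips per bit plane
        mask = (torch.rand_like(w) < ber).to(torch.int16) << bit
        q = q ^ mask
    q = torch.where(q >= 128, q - 256, q)  # back to signed two's complement
    return q.to(w.dtype) * scale

# Usage sketch: protect a frozen classifier whose weights carry 1% bit errors.
base = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                     nn.Flatten(), nn.Linear(8 * 32 * 32, 10))
with torch.no_grad():
    for p in base.parameters():
        p.copy_(flip_bits(p, ber=0.01))
fuse = NeuralFuseGenerator()
x = torch.rand(4, 3, 32, 32) * 2 - 1  # inputs scaled to [-1, 1]
logits = base(fuse(x))                # error-resistant representation
```

In this framing, only the generator's parameters would be trained (e.g., against simulated bit-flipped copies of the base model), which is consistent with the abstract's claim that the base DNN itself requires no retraining.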