Training deep neural networks (DNNs) consumes an immense amount of energy, which both restricts the development of deep learning and increases carbon emissions. Energy-efficient DNN training is therefore an essential research topic. During training, linear layers consume the most energy because their multiply-accumulate (MAC) operations rely heavily on energy-hungry full-precision (FP32) multiplication. Existing energy-efficient approaches reduce the cost of FP32 multiplication either by lowering the precision of the multiplication or by replacing it with cheaper operations such as addition or bitwise shift. However, none of them replaces all FP32 multiplications in both forward and backward propagation with low-precision, energy-efficient operations. In this work, we propose an Adaptive Layer-wise Scaling PoT Quantization (ALS-POTQ) method and a Multiplication-Free MAC (MF-MAC) that together replace all FP32 multiplications with INT4 additions and 1-bit XOR operations. We also propose Weight Bias Correction and Parameterized Ratio Clipping techniques to stabilize training and improve accuracy. None of these methods introduces extra multiplications, so our training scheme reduces the energy consumption of linear layers during training by up to 95.8%. Experimentally, accuracy degrades by less than 1% for CNN models on ImageNet and for a Transformer model on the WMT En-De task. In summary, we significantly outperform existing methods in both energy efficiency and accuracy.
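
The core arithmetic idea, multiplying two power-of-two (PoT) values with only an exponent addition and a sign XOR, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names `pot_quantize` and `mf_mac`, the exponent range `EXP_MIN..EXP_MAX`, the round-to-nearest-exponent rule, and the treatment of zero are all assumptions made for illustration, and the sketch omits adaptive layer-wise scaling, Weight Bias Correction, and Parameterized Ratio Clipping entirely.

```python
import math

# Assumed exponent range, chosen only for illustration (not taken from the paper).
EXP_MIN, EXP_MAX = -7, 0


def pot_quantize(x):
    """Map x to (sign_bit, exponent), i.e. (-1)**sign_bit * 2**exponent.

    Zero has no power-of-two code here, so it is returned as None (an assumption).
    """
    if x == 0.0:
        return None
    sign_bit = 1 if x < 0 else 0
    exponent = round(math.log2(abs(x)))               # nearest exponent in the log domain
    exponent = max(EXP_MIN, min(EXP_MAX, exponent))   # clip to the assumed range
    return sign_bit, exponent


def mf_mac(a_codes, w_codes):
    """Dot product of two PoT-coded vectors without multiplying the operands.

    Each product needs one XOR (sign) and one integer addition (exponent);
    the result is accumulated as an integer in units of 2**(2 * EXP_MIN).
    """
    acc = 0
    for a, w in zip(a_codes, w_codes):
        if a is None or w is None:
            continue                                  # a zero operand contributes nothing
        sign_bit = a[0] ^ w[0]                        # 1-bit XOR gives the product's sign
        exponent = a[1] + w[1]                        # exponent addition replaces multiplication
        term = 1 << (exponent - 2 * EXP_MIN)          # shift is always >= 0 after clipping
        acc += -term if sign_bit else term
    return math.ldexp(acc, 2 * EXP_MIN)               # single power-of-two rescale at the end


activations = [pot_quantize(v) for v in (0.5, -0.25, 1.0)]
weights = [pot_quantize(v) for v in (0.25, 0.5, -0.5)]
result = mf_mac(activations, weights)
```

Because every input in this example is already an exact power of two, `result` equals the ordinary dot product of the two vectors; for general inputs the quantization step introduces rounding error, which is the kind of error the paper's correction and clipping techniques are meant to control.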