Deep neural networks (DNNs) have achieved major breakthroughs in fields such as image classification and natural language processing. However, executing DNNs requires a massive number of multiply-accumulate (MAC) operations in hardware and therefore incurs high power consumption. To address this challenge, we propose a novel encoding-based digital MAC design. In this design, the multipliers are replaced by simple logic gates that project the results onto a wide bit representation. Each of these bits carries an individual position weight, which can be trained for a specific neural network to enhance inference accuracy. The outputs of the new multipliers are summed by bit-wise weighted accumulation, and the accumulation results are compatible with existing computing platforms that accelerate neural networks with either uniform or non-uniform quantization. Because multiplication is replaced by simple logic projection, the critical paths in the resulting circuits become much shorter. Correspondingly, the number of pipelining stages in the MAC array can be reduced, leading to a significantly smaller area as well as better power efficiency. The proposed design has been synthesized and verified with ResNet18-Cifar10, ResNet20-Cifar100 and ResNet50-ImageNet. The experimental results confirm a reduction in circuit area of up to 79.63% and a reduction in the power consumption of executing DNNs of up to 70.18%, while the accuracy of the neural networks is still well maintained.
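The idea of projecting operands onto a wide bit representation and then accumulating bit-wise with position weights can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit from the paper: the abstract does not specify the logic functions or bit widths, so the 2-bit operand width, the choice of one AND gate per bit pair, and the names `BITS`, `encode`, `POSITION_WEIGHTS`, and `encoded_mac` are all hypothetical. With the power-of-two position weights used here, the scheme reduces to exact multiplication; a trained design would presumably use different weights and possibly different gates.

```python
from itertools import product

BITS = 2  # assumed operand width, chosen only to keep the example small


def encode(a, w):
    """Project (a, w) onto BITS*BITS bits using one AND gate per bit pair.

    Hypothetical encoding: the paper's actual logic functions are not
    given in the abstract.
    """
    return [((a >> i) & 1) & ((w >> j) & 1)
            for i, j in product(range(BITS), repeat=2)]


# One position weight per encoded bit. These powers of two reproduce exact
# multiplication; a trained design would learn other values (assumption).
POSITION_WEIGHTS = [2 ** (i + j) for i, j in product(range(BITS), repeat=2)]


def encoded_mac(activations, weights, position_weights=POSITION_WEIGHTS):
    """Bit-wise weighted accumulation over a vector of operand pairs."""
    # Step 1: count the ones at each encoded bit position across all pairs.
    counts = [0] * len(position_weights)
    for a, w in zip(activations, weights):
        for k, bit in enumerate(encode(a, w)):
            counts[k] += bit
    # Step 2: scale each per-position count by its position weight and sum.
    return sum(s * c for s, c in zip(position_weights, counts))


activations = [3, 1, 2, 0]
weights = [2, 3, 1, 3]
result = encoded_mac(activations, weights)
reference = sum(a * w for a, w in zip(activations, weights))
print(result, reference)  # prints "11 11"
```

In this model the per-pair work is a handful of AND operations, and the position weights are applied once per bit position after counting rather than once per operand pair, which is the software analogue of moving the arithmetic out of the multiplier and into the accumulation.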