The rising demand for networked embedded systems with machine intelligence has driven sustained efforts by the research community to implement Convolutional Neural Network (CNN) based inference on resource-limited embedded devices. Redesigning a CNN to remove costly multiplication operations has already shown promising results in reducing inference energy usage. This paper proposes DietCNN, a new method for replacing multiplications in a CNN with table look-ups. Unlike existing methods that completely modify the CNN operations, the proposed methodology preserves the semantics of the major CNN operations. Conforming to the existing mechanism of the CNN layer operations ensures that the reliability of a standard CNN is preserved. It is shown that the proposed multiplication-free CNN, based on a single activation codebook, can achieve 4.7x, 5.6x, and 3.5x reductions in energy per inference in FPGA implementations of MNIST-LeNet-5, CIFAR10-VGG-11, and Tiny ImageNet-ResNet-18, respectively. Our results show that the DietCNN approach significantly improves the resource consumption and latency of deep inference for smaller models, which are often used in embedded systems. Our code is available at: https://github.com/swadeykgp/DietCNN
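To make the idea of replacing multiplications with table look-ups concrete, the following is a minimal generic sketch and not the authors' implementation. It assumes scalar codebooks for activations and weights with made-up values, and all function and variable names are hypothetical rather than taken from the DietCNN repository. Products of codebook entries are computed once offline, so the inference-time dot product needs only table reads and additions.

```python
# Hypothetical sketch of codebook-based, multiplication-free dot products.
# None of these names or values come from the DietCNN repository.

def nearest_index(codebook, value):
    """Return the index of the codebook entry closest to value."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))

def build_product_table(act_codebook, weight_codebook):
    """Precompute every activation-by-weight product once, offline."""
    return [[a * w for w in weight_codebook] for a in act_codebook]

def lookup_dot(act_idx, weight_idx, table):
    """Dot product over index vectors using only table reads and additions."""
    total = 0.0
    for i, j in zip(act_idx, weight_idx):
        total += table[i][j]
    return total

# Assumed, illustrative codebooks (a real system would learn these from data).
act_codebook = [0.0, 0.25, 0.5, 1.0]
weight_codebook = [-0.5, 0.0, 0.5]
table = build_product_table(act_codebook, weight_codebook)

activations = [0.24, 0.9, 0.0, 0.55]
weights = [0.45, -0.1, 0.3, 0.5]

# Encode both operands as codebook indices.
act_idx = [nearest_index(act_codebook, a) for a in activations]
weight_idx = [nearest_index(weight_codebook, w) for w in weights]

approx = lookup_dot(act_idx, weight_idx, table)           # look-ups and adds only
exact = sum(a * w for a, w in zip(activations, weights))  # reference with multiplies

print(f"approx={approx:.3f} exact={exact:.3f}")
```

The gap between the two printed values reflects the quantization error of these tiny codebooks. Larger or better-fitted codebooks would be expected to narrow it, at the cost of a bigger table.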