Convolutional Neural Networks (CNNs) have demonstrated remarkable ability across the field of computer vision. However, CNN inference requires a large number of arithmetic operations, making CNNs expensive to deploy on hardware. Current approaches alleviate this issue through hardware-supported algorithmic schemes that simplify spatial convolution. However, these methods still rely heavily on matrix multiplication, leading to significant computational overhead. To bridge the gap between hardware acceleration, algorithmic optimization, and approximate matrix multiplication, we propose TabConv, a novel table-based approximation for convolution that significantly reduces arithmetic operations during inference. Additionally, we introduce a priority masking technique based on cosine similarity to select which layers undergo table-based approximation, thereby maintaining model performance. We evaluate our approach on popular CNNs: ResNet-18, ResNet-34, and NetworkInNetwork (NIN). TabConv preserves over 93% of the original model's performance while reducing arithmetic operations by 36.5%, 25.8%, and 99.4% for ResNet-18 on CIFAR-10, CIFAR-100, and MNIST, respectively, by 35.6% and 99.3% for ResNet-34 on CIFAR-10 and MNIST, and by 98.9% for NIN on MNIST, achieving low-computation inference.
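The core idea behind replacing matrix multiplication with table lookups can be illustrated with a minimal product-quantization-style sketch. This is not TabConv's exact algorithm (the abstract does not specify it); it assumes a convolution has already been lowered to a matrix product `X @ W`, learns per-subspace prototypes from sample activations, and precomputes each prototype's contribution to the output so that inference reduces to encoding plus table sums:

```python
import numpy as np

def build_tables(W, n_subspaces, n_prototypes, X_train, seed=0):
    """Offline step: pick prototypes per input subspace and precompute
    lookup tables holding each prototype's product with the weight slice.
    Illustrative sketch only; real systems learn prototypes with k-means."""
    rng = np.random.default_rng(seed)
    d, m = W.shape
    sub = d // n_subspaces
    prototypes, tables = [], []
    for s in range(n_subspaces):
        seg = X_train[:, s * sub:(s + 1) * sub]
        # sample training rows as prototypes (a stand-in for clustering)
        idx = rng.choice(len(seg), n_prototypes, replace=False)
        P = seg[idx]
        prototypes.append(P)
        # precompute prototype contributions: (n_prototypes, m) table
        tables.append(P @ W[s * sub:(s + 1) * sub, :])
    return prototypes, tables

def approx_matmul(X, prototypes, tables):
    """Online step: replace X @ W by mapping each subvector to its nearest
    prototype and summing the corresponding precomputed table rows."""
    out = np.zeros((X.shape[0], tables[0].shape[1]))
    sub = prototypes[0].shape[1]
    for s, (P, T) in enumerate(zip(prototypes, tables)):
        seg = X[:, s * sub:(s + 1) * sub]
        # nearest-prototype encoding; at inference this encoding step
        # replaces almost all of the multiply-accumulate work
        d2 = ((seg[:, None, :] - P[None, :, :]) ** 2).sum(-1)
        out += T[d2.argmin(1)]
    return out
```

When the input subvectors fall exactly on prototypes, the lookup reproduces the exact product; otherwise accuracy degrades gracefully with quantization error, which is why layer selection matters.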
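The priority-masking idea can likewise be sketched in a few lines. The abstract does not give TabConv's exact selection rule, so the following is an assumed formulation: compare each layer's exact output against its table-approximated output via cosine similarity, and approximate only the layers whose approximation stays closest to the exact computation (the `keep_ratio` parameter here is hypothetical):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened activation tensors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def priority_mask(exact_outputs, approx_outputs, keep_ratio=0.5):
    """Rank layers by similarity between exact and table-approximated
    outputs; mark the top keep_ratio fraction for approximation and
    leave the least-similar layers exact to preserve accuracy."""
    sims = [cosine_similarity(e, a)
            for e, a in zip(exact_outputs, approx_outputs)]
    order = np.argsort(sims)                     # lowest similarity first
    n_exact = int(len(sims) * (1 - keep_ratio))  # layers kept exact
    mask = np.ones(len(sims), dtype=bool)        # True = approximate
    mask[order[:n_exact]] = False
    return mask, sims
```

This kind of per-layer gating is what lets the method trade a smaller reduction in arithmetic operations for retention of over 93% of the original model's accuracy.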