Convolutional Neural Networks (CNNs) have demonstrated remarkable ability across the field of computer vision. However, CNN inference requires a large number of arithmetic operations, making CNNs expensive to deploy on hardware. Current approaches alleviate this issue through hardware-supported algorithmic schemes that simplify spatial convolution. However, these methods still rely heavily on matrix multiplication, leading to significant computational overhead. To bridge the gap between hardware acceleration, algorithmic optimization, and approximate matrix multiplication, we propose TabConv, a novel table-based approximation for convolution that significantly reduces arithmetic operations during inference. Additionally, we introduce a priority masking technique based on cosine similarity to select which layers undergo table-based approximation, thereby maintaining model performance. We evaluate our approach on popular CNNs: ResNet-18, ResNet-34, and NetworkInNetwork (NIN). TabConv preserves over 93% of the original model's performance while reducing arithmetic operations by 36.5%, 25.8%, and 99.4% for ResNet-18 on CIFAR-10, CIFAR-100, and MNIST, respectively, by 35.6% and 99.3% for ResNet-34 on CIFAR-10 and MNIST, and by 98.9% for NIN on MNIST, achieving low-computation inference.
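The core idea behind replacing matrix multiplication with table lookups can be illustrated with a minimal product-quantization-style sketch. This is not TabConv's exact algorithm (the abstract does not specify it); it assumes a convolution has already been lowered to a matrix product `X @ W`, learns per-subspace prototypes from sample activations, and precomputes each prototype's contribution to the output so that inference reduces to encoding plus table sums:

```python
import numpy as np

def build_tables(W, n_subspaces, n_prototypes, X_train, seed=0):
    """Offline step: pick prototypes per input subspace and precompute
    lookup tables holding each prototype's product with the weight slice.
    Illustrative sketch only; real systems learn prototypes with k-means."""
    rng = np.random.default_rng(seed)
    d, m = W.shape
    sub = d // n_subspaces
    prototypes, tables = [], []
    for s in range(n_subspaces):
        seg = X_train[:, s * sub:(s + 1) * sub]
        # sample training rows as prototypes (a stand-in for clustering)
        idx = rng.choice(len(seg), n_prototypes, replace=False)
        P = seg[idx]
        prototypes.append(P)
        # precompute prototype contributions: (n_prototypes, m) table
        tables.append(P @ W[s * sub:(s + 1) * sub, :])
    return prototypes, tables

def approx_matmul(X, prototypes, tables):
    """Online step: replace X @ W by mapping each subvector to its nearest
    prototype and summing the corresponding precomputed table rows."""
    out = np.zeros((X.shape[0], tables[0].shape[1]))
    sub = prototypes[0].shape[1]
    for s, (P, T) in enumerate(zip(prototypes, tables)):
        seg = X[:, s * sub:(s + 1) * sub]
        # nearest-prototype encoding; at inference this encoding step
        # replaces almost all of the multiply-accumulate work
        d2 = ((seg[:, None, :] - P[None, :, :]) ** 2).sum(-1)
        out += T[d2.argmin(1)]
    return out
```

When the input subvectors fall exactly on prototypes, the lookup reproduces the exact product; otherwise accuracy degrades gracefully with quantization error, which is why layer selection matters.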
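The priority-masking idea can likewise be sketched in a few lines. The abstract does not give TabConv's exact selection rule, so the following is an assumed formulation: compare each layer's exact output against its table-approximated output via cosine similarity, and approximate only the layers whose approximation stays closest to the exact computation (the `keep_ratio` parameter here is hypothetical):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two flattened activation tensors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def priority_mask(exact_outputs, approx_outputs, keep_ratio=0.5):
    """Rank layers by similarity between exact and table-approximated
    outputs; mark the top keep_ratio fraction for approximation and
    leave the least-similar layers exact to preserve accuracy."""
    sims = [cosine_similarity(e, a)
            for e, a in zip(exact_outputs, approx_outputs)]
    order = np.argsort(sims)                     # lowest similarity first
    n_exact = int(len(sims) * (1 - keep_ratio))  # layers kept exact
    mask = np.ones(len(sims), dtype=bool)        # True = approximate
    mask[order[:n_exact]] = False
    return mask, sims
```

This kind of per-layer gating is what lets the method trade a smaller reduction in arithmetic operations for retention of over 93% of the original model's accuracy.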