The high computational cost of modern CNNs hinders edge deployment, and traditional ``hard'' sparsity (skipping mathematical zeros) loses effectiveness in deep layers or with smooth activations such as Tanh, whose outputs are rarely exactly zero. We propose a ``soft sparsity'' paradigm that uses a hardware-efficient Most Significant Bit (MSB) proxy to skip negligible non-zero multiplications. Integrated as a custom RISC-V instruction and evaluated on LeNet-5 (MNIST), the method reduces MAC operations by 88.42% under ReLU and 74.87% under Tanh with zero accuracy loss, outperforming zero-skipping by 5x. By clock-gating the multipliers for skipped products, we estimate power savings of 35.2% for ReLU and 29.96% for Tanh. Although memory access keeps the power reduction sub-linear in the operation savings, the approach significantly improves inference on resource-constrained devices.
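To make the idea concrete, the following is a minimal C sketch of a soft-sparse MAC guarded by an MSB proxy. The abstract does not specify the exact proxy, so the threshold \texttt{MSB\_THRESH}, the function names, and the choice to test only the activation operand (by analogy with activation zero-skipping) are illustrative assumptions, not the paper's implementation.

\begin{verbatim}
#include <stdint.h>
#include <stdlib.h>

/* Illustrative cut-off: magnitude bits at or above this position count
 * as significant. Assumed for this sketch; not a value from the paper. */
#define MSB_THRESH 4

/* MSB proxy: an operand is negligible when no bit of its magnitude is
 * set at or above MSB_THRESH, i.e. |x| < 2^MSB_THRESH. */
static inline int is_negligible(int8_t x)
{
    uint8_t mag = (uint8_t)abs(x);
    return (mag >> MSB_THRESH) == 0;
}

/* Soft-sparse MAC: skip (and, in hardware, clock-gate the multiplier
 * for) products whose activation the proxy deems negligible. */
static inline int32_t soft_sparse_mac(int32_t acc, int8_t act, int8_t w)
{
    if (is_negligible(act))
        return acc;                        /* product skipped */
    return acc + (int32_t)act * (int32_t)w;
}
\end{verbatim}

Note that setting \texttt{MSB\_THRESH} to 0 collapses the check to conventional hard zero-skipping, while larger values skip progressively more near-zero products, which is what recovers sparsity under smooth activations such as Tanh.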