We propose Multiplier-less INTeger (MINT) quantization, an efficient uniform quantization scheme for the weights and membrane potentials of spiking neural networks (SNNs). Unlike prior SNN quantization work, MINT quantizes the memory-hungry membrane potentials to extremely low precision (2-bit), significantly reducing the total memory footprint. MINT also shares a single quantization scaling factor between the weights and membrane potentials, which eliminates the multipliers that vanilla uniform quantization requires. Experimental results show that the proposed method matches the accuracy of full-precision models and of other state-of-the-art SNN quantization methods while outperforming them in total memory footprint and hardware cost at deployment. For instance, 2-bit MINT VGG-16 achieves 90.6% accuracy on CIFAR-10 with an approximately 93.8% reduction in total memory footprint relative to the full-precision model, and it reduces computation energy by 90% compared with vanilla uniform quantization at deployment.
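
To make the shared-scaling-factor idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a simple integrate-and-fire neuron with hard reset, a symmetric 2-bit integer range, and a firing threshold that is divided by the shared scale offline; the names `quantize`, `integer_step`, `float_step`, and `run_demo` are hypothetical. Under these assumptions, the integer path needs only additions, a comparison, and a clip, and it reproduces the spikes of a reference path that dequantizes, updates in floating point, and requantizes with the same scale.

```python
import numpy as np

BITS = 2
QMIN, QMAX = -(2 ** (BITS - 1)), 2 ** (BITS - 1) - 1  # -2 .. 1 for 2-bit


def quantize(x, scale):
    """Uniform quantization: divide by the scale, round, clip to the integer range."""
    return np.clip(np.round(x / scale), QMIN, QMAX).astype(np.int64)


def integer_step(u_int, w_int, spikes_in, theta_int):
    """One neuron update using only integer adds, a comparison, and a clip.

    Assumes weights and membrane potential share one scale, so the scale
    cancels out of the update and never has to be applied at run time.
    """
    # Binary input spikes select weight columns, so the synaptic current
    # is a plain sum of integers rather than a multiply-accumulate.
    current = w_int[:, spikes_in].sum(axis=1)
    acc = u_int + current
    spikes_out = acc > theta_int
    acc = np.where(spikes_out, 0, acc)   # hard reset of neurons that fired
    u_next = np.clip(acc, QMIN, QMAX)    # requantizing at the same scale is a clip
    return u_next, spikes_out


def float_step(u, w, spikes_in, theta, scale):
    """Reference path: float update followed by explicit requantization."""
    acc = u + w @ spikes_in.astype(np.float64)
    spikes_out = acc > theta
    acc = np.where(spikes_out, 0.0, acc)
    u_next = quantize(acc, scale) * scale
    return u_next, spikes_out


def run_demo(steps=8, n_in=6, n_out=4, scale=0.25, theta=0.5, seed=0):
    """Return True if the integer and float paths agree at every time step."""
    rng = np.random.default_rng(seed)
    w_int = quantize(rng.normal(0.0, 0.5, size=(n_out, n_in)), scale)
    w_f = w_int * scale
    theta_int = int(round(theta / scale))  # threshold rescaled once, offline
    u_int = np.zeros(n_out, dtype=np.int64)
    u_f = np.zeros(n_out)
    match = True
    for _ in range(steps):
        s = rng.random(n_in) < 0.5         # random binary input spikes
        u_int, out_i = integer_step(u_int, w_int, s, theta_int)
        u_f, out_f = float_step(u_f, w_f, s, theta, scale)
        match = match and np.array_equal(out_i, out_f)
        match = match and np.allclose(u_int * scale, u_f)
    return match


if __name__ == "__main__":
    print("integer path matches float reference:", run_demo())
```

If the weights and membrane potentials instead used two different scales, the accumulated current would have to be rescaled by their ratio before being added to the stored potential, which is the multiplication that the shared factor avoids. The scale of 0.25 is chosen here only so that the floating-point reference is exact.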