We propose Multiplier-less INTeger (MINT) quantization, an efficient uniform quantization scheme for the weights and membrane potentials of spiking neural networks (SNNs). Unlike prior SNN quantization work, MINT quantizes the memory-hungry membrane potentials to an extremely low bit-width (2-bit), significantly reducing the total memory footprint. MINT also shares a single quantization scale between the weights and membrane potentials, which eliminates the multipliers and floating-point arithmetic units that standard uniform quantization requires. Experimental results show that our method matches the accuracy of other state-of-the-art SNN quantization methods while outperforming them in total memory footprint and hardware cost at deployment time. For instance, 2-bit MINT VGG-16 achieves 48.6% accuracy on TinyImageNet (0.28% higher than the full-precision baseline) with an approximately 93.8% reduction in total memory footprint relative to the full-precision model; it also reduces area by 93% and dynamic power by 98% compared with other SNN quantization counterparts.
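To illustrate why a shared quantization scale can remove multipliers, the following is a minimal sketch and not the paper's exact formulation. It assumes a non-leaky integrate-and-fire neuron with a hard reset, a firing threshold pre-divided by the scale offline, and signed 2-bit codes; the names `quantize`, `int_lif_step`, `scale`, and `theta_int` are hypothetical. Because the weights and the membrane potential share one scale `s`, the update `s*u_new = s*u + s*sum(w_j * spike_j)` reduces to `u_new = u + sum(w_j * spike_j)` on the integer codes, and because input spikes are binary the weighted sum is just a sequence of additions.

```python
def quantize(x, scale, bits):
    """Uniform signed quantization: map a real value to a clamped integer code."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return max(qmin, min(qmax, round(x / scale)))


def int_lif_step(u_int, w_int, spikes, theta_int, bits):
    """One integer-only neuron update; returns (new_u_int, out_spike)."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    # Input spikes are binary, so the weighted sum is a chain of additions.
    acc = u_int + sum(w for w, s in zip(w_int, spikes) if s)
    if acc >= theta_int:
        return 0, 1  # fire, then hard reset (an assumption of this sketch)
    # Store the membrane potential back at the low bit-width.
    return max(qmin, min(qmax, acc)), 0


BITS = 2                      # 2-bit codes for weights and membrane potential
scale = 0.5                   # hypothetical shared scale
weights = [0.5, -1.0, 0.5, 0.5]
w_int = [quantize(w, scale, BITS) for w in weights]
theta_int = round(1.0 / scale)  # threshold 1.0 expressed in units of the scale

spike_trains = [
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
]

u_int = 0
outputs = []
for spikes in spike_trains:
    u_int, out = int_lif_step(u_int, w_int, spikes, theta_int, BITS)
    outputs.append(out)

print(w_int, outputs, u_int)
```

In this sketch the scale appears only in the offline steps (`quantize` and the computation of `theta_int`); the per-timestep update uses integer additions, a comparison, and a clamp.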