We propose Multiplier-less INTeger (MINT) quantization, an efficient uniform quantization scheme for the weights and membrane potentials in spiking neural networks (SNNs). Unlike prior SNN quantization works, MINT quantizes the memory-hungry membrane potentials to an extremely low bit-width (2-bit), significantly reducing the total memory footprint. Additionally, MINT shares the quantization scale between the weights and the membrane potentials, eliminating the multipliers and floating-point arithmetic units that standard uniform quantization requires. Experimental results demonstrate that our method matches the accuracy of other state-of-the-art SNN quantization works while outperforming them on total memory footprint and hardware cost at deployment time. For instance, 2-bit MINT VGG-16 achieves 48.6% accuracy on TinyImageNet (0.28% better than the full-precision baseline) with an approximately 93.8% reduction in total memory footprint relative to the full-precision model; meanwhile, our model reduces area by 93% and dynamic power by 98% compared to other SNN quantization counterparts.
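To illustrate why a shared quantization scale can remove multipliers, the following minimal sketch compares an integer-only neuron update with a floating-point reference. It is not the paper's implementation: the abstract does not specify the neuron model, so the sketch assumes a non-leaky integrate-and-fire neuron with a hard reset to zero, a 2-bit signed range, and hypothetical values for the shared scale `ALPHA` and the threshold `THETA`. All function names are illustrative. The point is that, when weights and membrane potential share one positive scale, the firing test `ALPHA * acc > THETA` is equivalent to `acc > THETA / ALPHA`, and the right-hand side can be computed once offline.

```python
# Illustrative sketch only; neuron model and constants are assumptions.
ALPHA = 0.5   # shared quantization scale (hypothetical value)
THETA = 1.0   # firing threshold in the real-valued domain (hypothetical value)
BITS = 2      # bit-width of stored weights and membrane potentials

QMIN = -(2 ** (BITS - 1))      # -2 for 2-bit signed codes
QMAX = 2 ** (BITS - 1) - 1     #  1 for 2-bit signed codes

# The only division happens once, before deployment.
THETA_SCALED = THETA / ALPHA


def clamp(q):
    """Saturate an integer code to the representable range."""
    return max(QMIN, min(QMAX, q))


def quantize(x, scale=ALPHA):
    """Uniform quantization of a real value to a signed integer code."""
    return clamp(round(x / scale))


def lif_step_int(u_int, w_int, spikes_in):
    """One update using only additions and a comparison.

    Input spikes are binary, so the weighted sum reduces to adding
    the weights of the inputs that fired.
    """
    acc = u_int + sum(w for w, s in zip(w_int, spikes_in) if s)
    fired = acc > THETA_SCALED
    u_next = 0 if fired else clamp(acc)  # hard reset, else store low-bit state
    return u_next, int(fired)


def lif_step_ref(u_int, w_int, spikes_in):
    """Reference update: dequantize, compute in floating point, requantize."""
    u = ALPHA * u_int + sum(ALPHA * w * s for w, s in zip(w_int, spikes_in))
    fired = u > THETA
    u_next = 0 if fired else quantize(u)
    return u_next, int(fired)


if __name__ == "__main__":
    weights = [quantize(0.6), quantize(-0.4), quantize(0.5)]
    u = 0
    for spikes in ([1, 0, 1], [1, 0, 1]):
        u, out = lif_step_int(u, weights, spikes)
        print(u, out)
```

In this sketch the integer path never multiplies by `ALPHA`; the scale only appears in the precomputed threshold. With separate scales for weights and membrane potentials, the accumulation step would instead need a rescaling multiplication at every update.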