Spiking Neural Networks (SNNs), with their inherent recurrence, offer an efficient means of processing the asynchronous temporal data generated by Dynamic Vision Sensors (DVS), making them well suited to event-based vision applications. However, existing SNN accelerators suffer from limited adaptability to diverse neuron models, bit precisions, and network sizes; inefficient membrane potential (Vmem) handling; and limited sparsity optimizations. In response to these challenges, we propose \chipname, a scalable and reconfigurable digital compute-in-memory (CIM) SNN accelerator with the following key features: 1) It uses in-memory computation and reconfigurable operating modes to minimize data movement for the weight and Vmem data structures while adapting efficiently to different workloads. 2) It supports multiple weight/Vmem bit precisions, enabling a trade-off between accuracy and energy efficiency and enhancing adaptability to diverse application demands. 3) A zero-skipping mechanism for sparse inputs significantly reduces energy consumption by exploiting the inherent sparsity of spikes, without incurring high overhead at low sparsity. 4) Finally, an asynchronous handshaking mechanism preserves the computational efficiency of the pipeline under the variable execution times of different computation units. We fabricated \chipname in 65 nm Taiwan Semiconductor Manufacturing Company (TSMC) low-power (LP) technology. When scaled to the same technology node, it demonstrates performance competitive with other digital SNN accelerators reported in the recent literature while supporting advanced reconfigurability. It achieves up to 5 TOPS/W energy efficiency at 95% input sparsity with 4-bit weights and 7-bit Vmem precision.
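The energy benefit of the zero-skipping mechanism can be illustrated with a minimal software sketch (this is an illustrative model, not the chip's actual datapath; the function and variable names are assumptions): for a binary spike vector, only nonzero inputs trigger a weight fetch and accumulate into Vmem, so at 95% input sparsity only 5% of the accumulate operations execute.

```python
def integrate_dense(spikes, weights, vmem):
    """Baseline: every input triggers a weight access and accumulate."""
    ops = 0
    for s, w in zip(spikes, weights):
        vmem += w if s else 0
        ops += 1
    return vmem, ops

def integrate_zero_skip(spikes, weights, vmem):
    """Zero-skipping: silent (zero) inputs skip the fetch and accumulate."""
    ops = 0
    for s, w in zip(spikes, weights):
        if s:  # only spiking inputs cost an operation
            vmem += w
            ops += 1
    return vmem, ops

# 95%-sparse spike vector: 5 spikes out of 100 inputs.
spikes = [1 if i % 20 == 0 else 0 for i in range(100)]
weights = list(range(100))

v_dense, dense_ops = integrate_dense(spikes, weights, 0)
v_skip, skip_ops = integrate_zero_skip(spikes, weights, 0)
assert v_dense == v_skip   # identical membrane potential
assert dense_ops == 100 and skip_ops == 5
```

The same membrane potential is computed either way; the skip path simply performs 20x fewer accumulates at this sparsity, which is the source of the energy savings.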