Brain-inspired spiking neural networks (SNNs) replace the multiply-accumulate operations of traditional neural networks with integrate-and-fire neurons, with the goal of achieving greater energy efficiency. Specialized hardware implementations of these neurons have clear advantages over general-purpose devices in terms of power and performance, but scale poorly when accelerating large neural networks. DeepFire2 introduces a hardware architecture that can map large network layers efficiently across multiple super logic regions in a multi-die FPGA. This gives more control over resource allocation and parallelism, benefiting both throughput and energy consumption. Avoiding the use of lookup tables to implement the AND operations of an SNN prevents the layer size from being limited by logic resources. A deep pipeline increases the clock speed to up to 600 MHz. We double the throughput and power efficiency compared to our previous version of DeepFire, which equates to an almost 10-fold improvement over other previous implementations. Importantly, we are able to deploy a large ImageNet model while maintaining a throughput of over 1500 frames per second.
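To illustrate how integrate-and-fire neurons can stand in for multiply-accumulate operations, here is a minimal software sketch of one time step of a layer with binary input spikes. It is not the DeepFire2 hardware design: the function name `if_layer_step`, the single shared threshold, and the reset-by-subtraction rule are assumptions made for this example. Because each input spike is 0 or 1, the product of spike and weight reduces to a gate (the AND operation) that either adds the weight to the membrane potential or skips it.

```python
from typing import List


def if_layer_step(
    spikes_in: List[int],
    weights: List[List[int]],
    potentials: List[int],
    threshold: int,
) -> List[int]:
    """Advance a layer of integrate-and-fire neurons by one time step.

    spikes_in  : binary input spikes (0 or 1), one per input
    weights    : weights[j][i] is the weight from input i to neuron j
    potentials : membrane potentials, updated in place
    threshold  : firing threshold shared by all neurons (assumption)
    """
    spikes_out = []
    for j, row in enumerate(weights):
        for spike, weight in zip(spikes_in, row):
            # Gating by a binary spike replaces the multiplication:
            # the weight is accumulated only when the input fired.
            if spike:
                potentials[j] += weight
        if potentials[j] >= threshold:
            # Reset by subtraction (assumption; other reset rules exist).
            potentials[j] -= threshold
            spikes_out.append(1)
        else:
            spikes_out.append(0)
    return spikes_out
```

For example, with `weights = [[2, -1, 3], [1, 1, 1]]`, zero initial potentials, and a threshold of 4, the input `[1, 0, 1]` makes only the first neuron fire, leaving potentials `[1, 2]`.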