Including Artificial Neural Networks in embedded systems at the edge allows applications to exploit Artificial Intelligence capabilities directly within devices operating at the network periphery. This paper introduces Spiker+, a comprehensive framework for generating efficient, low-power, and low-area customized Spiking Neural Network (SNN) accelerators on FPGA for inference at the edge. Spiker+ provides a configurable multi-layer hardware SNN, a library of highly efficient neuron architectures, and a design framework, enabling the development of complex neural network accelerators with a few lines of Python code. Spiker+ is tested on two benchmark datasets, MNIST and Spiking Heidelberg Digits (SHD). On MNIST, it achieves performance competitive with state-of-the-art SNN accelerators. It outperforms them in resource usage, requiring 7,612 logic cells and 18 Block RAMs (BRAMs), which lets it fit in very small FPGAs, and in power consumption, drawing only 180 mW for a complete inference on an input image. Its latency of 780 µs/img is comparable to that of state-of-the-art designs. To the authors' knowledge, Spiker+ is the first SNN accelerator tested on SHD. In this case, the accelerator requires 18,268 logic cells and 51 BRAMs, with an overall power consumption of 430 mW and a latency of 54 µs for a complete inference on an input sample. These results underscore the significance of Spiker+ in the hardware-accelerated SNN landscape, making it an excellent solution for deploying configurable and tunable SNN architectures in resource- and power-constrained edge applications.
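The power and latency figures above can be combined into a rough energy-per-inference estimate. The following is a minimal sketch, not a result reported in the abstract: it assumes the stated power is the average draw over the whole inference latency, and the function name `energy_per_inference_uj` is purely illustrative.

```python
def energy_per_inference_uj(power_mw: float, latency_us: float) -> float:
    """Estimate energy per inference in microjoules.

    Assumption (not stated in the abstract): the reported power is the
    average power drawn over the full inference latency.
    """
    # mW * us = 1e-3 W * 1e-6 s = 1e-9 J = 1e-3 uJ
    return power_mw * latency_us * 1e-3


# (power in mW, latency in us), taken from the figures quoted in the abstract
configs = {
    "MNIST": (180.0, 780.0),
    "SHD": (430.0, 54.0),
}

for name, (power_mw, latency_us) in configs.items():
    energy = energy_per_inference_uj(power_mw, latency_us)
    print(f"{name}: about {energy:.2f} uJ per inference")
```

Under this assumption, the estimate is about 140.4 µJ per MNIST image and about 23.2 µJ per SHD sample. These are derived back-of-the-envelope values and should be read as indicative only.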