Spiking Neural Networks (SNNs) promise significant advantages over conventional Artificial Neural Networks (ANNs) for applications that require real-time processing of temporally sparse data streams under strict power constraints, a concept known as the Neuromorphic Advantage. However, the limited availability of neuromorphic hardware creates a substantial simulation-to-hardware gap that impedes algorithmic innovation, hardware-software co-design, and the development of mature open-source ecosystems. To address this challenge, we introduce Yet Another Neuromorphic Accelerator (YANA), an FPGA-based digital SNN accelerator that bridges this gap by providing an accessible hardware and software framework for neuromorphic computing. YANA implements a five-stage, event-driven processing pipeline that fully exploits temporal and spatial sparsity while supporting arbitrary SNN topologies through point-to-point neuron connections. The architecture features an input preprocessing scheme that sustains a steady rate of one event per cycle without risk of buffer overflow, and it implements hardware-efficient event-driven neuron updates using lookup tables for leak calculations. We demonstrate YANA's ability to exploit sparsity through experiments on the Spiking Heidelberg Digits dataset, which show near-linear scaling of inference time with both spatial and temporal sparsity. Deployed on the accessible AMD Kria KR260 platform, a single YANA core uses 740 LUTs, 918 registers, 7 BRAMs, and 24 URAMs, and supports up to $2^{17}$ synapses and $2^{10}$ neurons. We release the YANA framework as an open-source project, providing an end-to-end solution for training, optimizing, and deploying SNNs that integrates with existing neuromorphic computing tools through the Neuromorphic Intermediate Representation (NIR).