There is a growing need for edge training to adapt to dynamically changing environments. Neuromorphic computing is a promising pathway to high-efficiency intelligent computation at energy-constrained edges, but existing neuromorphic architectures cannot directly train spiking neural networks (SNNs) with backpropagation. We develop a multi-core neuromorphic architecture with Feedforward-Propagation, Back-Propagation, and Weight-Gradient engines in each core, supporting highly efficient parallel computing at both the engine and core levels. By fully leveraging the sparsity in SNN training, the architecture combines multiple data flows with sparse-computation optimization, achieving a high energy efficiency of 1.05 TFLOPS/W @ FP16 @ 28 nm and a 55–85% reduction in DRAM access compared with an A100 GPU during SNN training, and demonstrating 20-core deep SNN training and 5-worker federated learning on FPGAs. Our study presents the first multi-core neuromorphic architecture supporting direct SNN training, facilitating neuromorphic computing in edge-learnable applications.
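The Feedforward-Propagation, Back-Propagation, and Weight-Gradient engines map onto the three computational phases of surrogate-gradient SNN training. The sketch below illustrates those phases for a single leaky integrate-and-fire (LIF) layer in NumPy; the toy sizes, rectangular surrogate, rate-based loss, and detached reset are illustrative assumptions, not the paper's exact method. Note how the binary spike tensors make both the forward pass and the weight-gradient outer products sparse, which is the sparsity the architecture exploits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: T time steps, batch B, N_in inputs, N_out spiking neurons.
T, B, N_in, N_out = 8, 4, 16, 10
V_TH, DECAY = 1.0, 0.5                       # firing threshold, membrane decay

W = rng.normal(0.0, 0.5, (N_in, N_out))
x = (rng.random((T, B, N_in)) < 0.3).astype(float)   # sparse input spike trains
target_rate = np.full((B, N_out), 0.2)               # toy target firing rate

def surrogate_grad(v):
    """Rectangular surrogate for the non-differentiable Heaviside spike."""
    return (np.abs(v - V_TH) < 0.5).astype(float)

# --- Feedforward-Propagation (FP) phase: LIF dynamics over T steps ---
v_post = np.zeros((B, N_out))
v_pre_hist, s_hist = [], []
for t in range(T):
    v_pre = DECAY * v_post + x[t] @ W        # leaky integration of input current
    s = (v_pre >= V_TH).astype(float)        # spike where threshold is crossed
    v_post = v_pre * (1.0 - s)               # hard reset after a spike
    v_pre_hist.append(v_pre)
    s_hist.append(s)

rate = np.mean(s_hist, axis=0)               # per-neuron firing rate
loss = 0.5 * np.mean((rate - target_rate) ** 2)

# --- Back-Propagation (BP) + Weight-Gradient (WG) phases: BPTT ---
d_rate = (rate - target_rate) / (B * N_out)  # dL/d(rate)
dW = np.zeros_like(W)
d_v_pre_next = np.zeros((B, N_out))
for t in reversed(range(T)):
    ds = d_rate / T                          # dL/ds[t] under the rate loss
    # BP: route error through the spike (surrogate) and membrane paths;
    # the reset's dependence on s[t] is detached, a common simplification.
    d_v_pre = (ds * surrogate_grad(v_pre_hist[t])
               + DECAY * d_v_pre_next * (1.0 - s_hist[t]))
    # WG: x[t] is a sparse binary spike matrix, so most rows of this
    # outer-product accumulation are zero -- a sparse-compute opportunity.
    dW += x[t].T @ d_v_pre
    d_v_pre_next = d_v_pre

W -= 0.1 * dW                                # plain SGD weight update
```

Because all three phases run per time step over the same cached state, they can be pipelined across dedicated engines and cores rather than executed sequentially as in this single-threaded sketch.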