Spiking neural networks (SNNs) take inspiration from the brain to enable energy-efficient computations. Since the advent of Transformers, SNNs have struggled to compete with artificial networks on modern sequential tasks, as they inherit limitations from recurrent neural networks (RNNs), with the added challenge of training with non-differentiable binary spiking activations. However, a recent renewed interest in efficient alternatives to Transformers has given rise to state-of-the-art recurrent architectures named state space models (SSMs). This work systematically investigates, for the first time, the intersection of state-of-the-art SSMs with SNNs for long-range sequence modelling. Results suggest that SSM-based SNNs can outperform the Transformer on all tasks of a well-established long-range sequence modelling benchmark. It is also shown that SSM-based SNNs can outperform current state-of-the-art SNNs with fewer parameters on sequential image classification. Finally, a novel feature mixing layer is introduced, improving SNN accuracy while challenging assumptions about the role of binary activations in SNNs. This work paves the way for deploying powerful SSM-based architectures, such as large language models, to neuromorphic hardware for energy-efficient long-range sequence modelling.