Spiking neural networks (SNNs) are posited as a computationally efficient and biologically plausible alternative to conventional neural architectures, with the leaky integrate-and-fire (LIF) neuron model as their core computational unit. However, the limited hidden-state representation of LIF neurons, characterized by a scalar membrane potential, together with their sequential spike-generation process, poses challenges for developing scalable spiking models that address long-range dependencies in sequence learning tasks. In this study, we develop a scalable probabilistic spiking learning framework for long-range dependency tasks, leveraging the fundamentals of state space models (SSMs). Unlike LIF neurons, which rely on the deterministic Heaviside function for sequential spike generation, we introduce a SpikeSampler layer that samples spikes stochastically from an SSM-based neuronal model while allowing parallel computation. To address the non-differentiability of the spiking operation and enable effective training, we also propose a surrogate function tailored to the stochastic nature of the SpikeSampler layer. To enhance inter-neuron communication, we introduce the SpikeMixer block, which integrates spikes from the neuron population in each layer. This is followed by a ClampFuse layer incorporating a residual connection to capture complex dependencies, enabling the model to scale. Our models attain state-of-the-art performance among SNNs across diverse long-range dependency tasks, encompassing the Long Range Arena benchmark, permuted sequential MNIST, and the Speech Commands dataset, and exhibit sparse spiking patterns that highlight their computational efficiency.
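The core idea of stochastic spike sampling with a surrogate gradient can be illustrated with a minimal sketch. This is not the paper's implementation: the probability function (a sigmoid of the SSM-derived membrane potential), the `spike_sampler` and `surrogate_grad` names, and the choice of sigmoid-derivative surrogate are all illustrative assumptions. It shows how spikes can be drawn in parallel over all time steps, unlike the sequential Heaviside thresholding of a LIF neuron, and how a smooth surrogate stands in for the non-differentiable sampling step during backpropagation.

```python
import numpy as np

def sigmoid(x):
    # Numerically standard logistic function mapping potentials to (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def spike_sampler(u, rng):
    """Sample binary spikes from membrane potentials u (hypothetical sketch).

    Each neuron fires with probability sigmoid(u), independently and in
    parallel across all time steps and neurons, rather than via a
    sequential deterministic threshold.
    """
    p = sigmoid(u)
    spikes = (rng.random(u.shape) < p).astype(u.dtype)
    return spikes, p

def surrogate_grad(p):
    """Assumed surrogate derivative d(spike)/d(u) ≈ sigmoid'(u) = p * (1 - p).

    The Bernoulli draw itself has no gradient; a smooth surrogate lets
    gradients flow through the sampling step during training.
    """
    return p * (1.0 - p)

rng = np.random.default_rng(0)
# Low mean potential -> low firing probability -> sparse spike trains.
u = rng.normal(loc=-2.0, scale=1.0, size=(4, 128))  # (time steps, neurons)
spikes, p = spike_sampler(u, rng)
grad = surrogate_grad(p)
print("firing rate:", spikes.mean())
```

Because the samples at different time steps are independent given the potentials, the whole sequence can be materialized in one vectorized call, which is what permits parallel (rather than step-by-step) spike generation.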