Recent advances in neuroscience have propelled the development of Spiking Neural Networks (SNNs), which can in turn advance neuroscience research and, owing to their spike-driven computation, serve as an energy-efficient alternative to Artificial Neural Networks (ANNs). However, previous studies have often neglected the multiscale information in event data and its spatiotemporal correlations, so SNN models end up treating each frame of input events as a static image. We hypothesize that this oversimplification contributes significantly to the performance gap between SNNs and traditional ANNs. To address this issue, we design a Spiking Multiscale Attention (SMA) module that captures multiscale spatiotemporal interaction information. Furthermore, we develop a regularization method named Attention ZoneOut (AZO), which uses spatiotemporal attention weights to reduce the model's generalization error through pseudo-ensemble training. Our approach achieves state-of-the-art results on mainstream neuromorphic datasets. In addition, a 104-layer ResNet architecture enhanced with SMA and AZO reaches 77.1% accuracy on the ImageNet-1K dataset. This result establishes state-of-the-art performance among SNNs with non-transformer architectures and underscores the effectiveness of our method in narrowing the performance gap between SNN models and traditional ANN models.
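To make the idea of attention-guided ZoneOut more concrete, the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that weakly attended units are more likely to keep their value from the previous time step, as in standard ZoneOut, and that inference uses the expected mix of the two states. The function name `attention_zoneout`, the parameter `zone_rate`, and the rule `p_zone = zone_rate * (1 - attn)` are all hypothetical; the abstract does not specify any of them.

```python
import random


def attention_zoneout(prev, curr, attn, zone_rate=0.25, training=True, rng=None):
    """Hypothetical attention-guided ZoneOut for a single time step.

    prev, curr: lists of floats, the state at t-1 and the candidate state at t.
    attn:       list of attention weights in [0, 1], one per unit.
    zone_rate:  upper bound on the per-unit probability of keeping prev.
    """
    if not (len(prev) == len(curr) == len(attn)):
        raise ValueError("prev, curr and attn must have the same length")
    if rng is None:
        rng = random.Random()

    out = []
    for p, c, a in zip(prev, curr, attn):
        # Assumption: the lower the attention weight, the more likely
        # the unit is "zoned out" (keeps its previous value).
        p_zone = zone_rate * (1.0 - a)
        if training:
            # Sample a different sub-network at every call (pseudo-ensemble).
            out.append(p if rng.random() < p_zone else c)
        else:
            # As in standard ZoneOut, use the expectation at inference time.
            out.append(p_zone * p + (1.0 - p_zone) * c)
    return out
```

With all attention weights equal to 1, the sketch never zones out a unit and simply returns the current state; with weights of 0 and `zone_rate=1.0`, every unit keeps its previous value.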