Rethinking Residual Connection in Training Large-Scale Spiking Neural Networks

Spiking Neural Networks (SNNs) are among the best-known brain-inspired models, but their non-differentiable spiking mechanism makes large-scale SNNs hard to train. To ease this training, many methods are borrowed from Artificial Neural Networks (ANNs), among which deep residual learning is the most commonly used. However, the unique features of SNNs mean that intuitions built on ANNs do not carry over to SNNs. Although a few studies have made pioneering attempts at the topology of Spiking ResNet, the advantages of the different connections remain unclear. To address this issue, we analyze the merits and limitations of various residual connections and support our analysis empirically with extensive experiments. Based on our observations, we abstract the best-performing connections into the densely additive (DA) connection, extend this concept to other topologies, and propose four architectures for training large-scale SNNs, termed DANet, which bring an accuracy gain of up to 13.24% on ImageNet. In addition, to present a detailed methodology for designing the topology of large-scale SNNs, we discuss in depth the scenarios each architecture suits, based on their performance on datasets of various scales, and demonstrate their advantages over prior architectures. At a low training expense, our best-performing ResNet-50/101/152 obtain 73.71%/76.13%/77.22% top-1 accuracy on ImageNet with 4 time steps. We believe this work offers insights for designing network topologies in future work and will promote the development of large-scale SNNs. The code will be publicly available.
