Integrating self-attention mechanisms into Spiking Neural Networks (SNNs) has attracted considerable interest in deep learning, primarily because of the biological properties of SNNs. Recent SNN architectures such as Spikformer have shown promising results by using Spiking Self-Attention (SSA) and Spiking Patch Splitting (SPS) modules. However, we observe that Spikformer may consume excessive energy, potentially because of redundant channels and blocks. To mitigate this issue, we propose Auto-Spikformer, a one-shot Transformer Architecture Search (TAS) method that automates the search for an optimized Spikformer architecture. To facilitate the search, we propose Evolutionary SNN neurons (ESNN), a method that optimizes the SNN parameters, and we adopt the existing weight-entanglement supernet training method, which optimizes the Vision Transformer (ViT) parameters. Moreover, we propose an accuracy and energy balanced fitness function $\mathcal{F}_{AEB}$ that jointly considers energy consumption and accuracy and aims to find a Pareto-optimal combination that balances the two objectives. Our experimental results demonstrate the effectiveness of Auto-Spikformer, which outperforms state-of-the-art methods, including manually and automatically designed CNN and ViT models, while significantly reducing energy consumption.
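To make the idea of balancing accuracy against energy concrete, the following is a minimal sketch of Pareto filtering followed by a scalar fitness score. The abstract does not give the form of $\mathcal{F}_{AEB}$, so the `fitness` function here (accuracy minus a weighted, normalized energy term), the `Candidate` type, the `weight` and `energy_ref` parameters, and all numbers are illustrative assumptions, not the authors' definition or results.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical architecture candidate; values below are made up."""
    name: str
    accuracy: float   # top-1 accuracy in [0, 1]
    energy_mj: float  # estimated energy per inference, in millijoules


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.energy_mj <= b.energy_mj
    strictly_better = a.accuracy > b.accuracy or a.energy_mj < b.energy_mj
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


def fitness(c: Candidate, energy_ref: float, weight: float = 0.2) -> float:
    """Assumed scalarization (NOT the paper's F_AEB): reward accuracy,
    penalize energy normalized by a reference budget."""
    return c.accuracy - weight * (c.energy_mj / energy_ref)


# Illustrative candidates only; not taken from any experiment.
candidates = [
    Candidate("A", 0.95, 1.0),
    Candidate("B", 0.94, 0.5),
    Candidate("C", 0.93, 0.8),  # dominated by B: lower accuracy, higher energy
    Candidate("D", 0.90, 0.4),
]

front = pareto_front(candidates)
best = max(front, key=lambda c: fitness(c, energy_ref=1.0))
print([c.name for c in front], best.name)
```

In this sketch the Pareto filter discards candidates that are worse on both objectives, and the scalar score then picks one point on the remaining front; changing `weight` moves the choice toward higher accuracy or lower energy.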