Elastic Spiking Transformers for Efficient Gesture Understanding

Spiking Neural Networks (SNNs), particularly Spiking Transformers, offer energy-efficient processing of event-based sensor data for healthcare applications. Yet current architectures are rigid: they are trained and deployed as static networks with fixed parameter counts and computational graphs. This limits deployment on neuromorphic hardware such as Loihi and SpiNNaker, where on-chip constraints often require smaller models that trade accuracy for feasibility. We introduce the Elastic Spiking Transformer, a runtime-adaptive architecture that brings elasticity into the spiking paradigm. Inspired by Matryoshka-style representation learning, it embeds nested elasticity in the Feature Extractor, Spiking Self-Attention, and Feed-Forward blocks. Through granularity-aware weight sharing, a single universal model can dynamically slice network width and attention heads at inference time without retraining. This design provides two key advantages for SNNs. First, it allows the model to adjust its parameter footprint to different hardware memory budgets. Second, reducing active neurons also lowers spike firing rates, yielding proportional reductions in synaptic operations, an energy benefit not directly available in standard artificial neural networks. We evaluate the approach on CIFAR10/100, CIFAR10-DVS, and the EHWGesture clinical gesture understanding dataset. Results show that one Elastic Spiking Transformer spans a broad range of complexity-accuracy trade-offs, matching or surpassing independently trained baselines while supporting adaptive, real-time gesture recognition on resource-constrained edge devices.
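The nested weight sharing described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class `ElasticSpikingLinear`, the `width_ratio` parameter, the single-step threshold neuron `lif_step`, and the synaptic-operation count (input spikes times active output neurons) are all assumptions made for illustration. The idea shown is that a narrower sub-layer is simply the leading block of the full weight matrix, so one set of weights serves every width. Attention heads could presumably be sliced the same way along the head axis.

```python
import numpy as np


def lif_step(current, threshold=1.0):
    """Toy single-step spiking nonlinearity: fire (1.0) where the input
    current reaches the threshold, stay silent (0.0) otherwise."""
    return (current >= threshold).astype(np.float32)


class ElasticSpikingLinear:
    """Hypothetical linear layer whose leading rows and columns form
    nested sub-layers (Matryoshka-style weight sharing)."""

    def __init__(self, in_features, out_features, seed=0):
        rng = np.random.default_rng(seed)
        self.weight = rng.normal(
            0.0, 1.0, size=(out_features, in_features)
        ).astype(np.float32)

    def sliced_weight(self, width_ratio, in_active):
        """Return a view of the top-left block of the full weight matrix.
        Basic slicing copies nothing, so every width shares parameters."""
        out_active = max(1, int(round(self.weight.shape[0] * width_ratio)))
        return self.weight[:out_active, :in_active]

    def forward(self, spikes_in, width_ratio=1.0):
        w = self.sliced_weight(width_ratio, spikes_in.shape[-1])
        spikes_out = lif_step(spikes_in @ w.T)
        # Assumed cost model: each input spike triggers one accumulate
        # per active output neuron.
        sops = int(spikes_in.sum()) * w.shape[0]
        return spikes_out, sops


# Usage: run the same layer at full and half width on one binary spike vector.
rng = np.random.default_rng(1)
spikes = (rng.random(64) < 0.2).astype(np.float32)
layer = ElasticSpikingLinear(64, 128)
out_full, sops_full = layer.forward(spikes, width_ratio=1.0)
out_half, sops_half = layer.forward(spikes, width_ratio=0.5)
print(out_full.shape, out_half.shape, sops_full, sops_half)
```

Under this toy cost model, halving the active width halves the synaptic operations of the layer itself. In a stacked network the next layer would also receive fewer input spikes, which is the compounding energy effect the abstract refers to.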
