The recent success of Vision Transformers has generated significant interest in attention mechanisms and transformer architectures. Although existing methods have proposed spiking self-attention mechanisms compatible with spiking neural networks, they remain difficult to deploy effectively on current neuromorphic platforms. This paper introduces ViT-LCA, a novel model that combines vision transformers with the Locally Competitive Algorithm (LCA) to facilitate efficient neuromorphic deployment. Our experiments show that ViT-LCA achieves higher accuracy on the ImageNet-1K dataset while consuming significantly less energy than its spiking vision transformer counterparts. Furthermore, ViT-LCA's neuromorphic-friendly design allows for a more direct mapping onto current neuromorphic architectures.
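To make the LCA component concrete, below is a minimal sketch of the standard Locally Competitive Algorithm dynamics (Rozell et al., 2008), which solve an L1-regularized sparse-coding problem through leaky-integrator neurons with lateral inhibition. This is a generic illustration, not the paper's ViT-LCA architecture; the function name, dictionary, and parameter values are illustrative assumptions.

```python
import numpy as np

def lca_sparse_code(x, Phi, lam=0.1, tau=10.0, n_steps=200):
    """Generic LCA sketch (not the paper's exact model).

    Approximately solves  min_a 0.5*||x - Phi @ a||^2 + lam*||a||_1
    via leaky-integrator neuron dynamics with lateral inhibition.

    x   : (d,)   input vector (e.g., a flattened feature or patch)
    Phi : (d, n) dictionary with unit-norm columns
    """
    b = Phi.T @ x                            # feed-forward drive
    G = Phi.T @ Phi - np.eye(Phi.shape[1])   # lateral inhibition (Gram minus self-connections)
    u = np.zeros(Phi.shape[1])               # membrane potentials
    soft = lambda v: np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)
    for _ in range(n_steps):
        a = soft(u)                          # thresholded activations (sparse code)
        u += (b - u - G @ a) / tau           # leaky integration of the dynamics
    return soft(u)

# Toy usage: encode a random input with a random unit-norm dictionary.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 256))
Phi /= np.linalg.norm(Phi, axis=0)
a = lca_sparse_code(rng.standard_normal(64), Phi)
print(f"{np.count_nonzero(a)} of {a.size} coefficients active")
```

Because each neuron's update depends only on its feed-forward drive and inhibition from currently active neighbors, these dynamics map naturally onto event-driven neuromorphic hardware, which is the property the paper exploits.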