Spiking Vision Transformer (SViT) models are promising low-power ViT models that solve vision-based tasks with state-of-the-art performance. However, their large size limits their deployment on resource-constrained embedded platforms, underscoring the need for model compression. One prominent compression technique is pruning, and state-of-the-art works employ unstructured pruning to compress SViT models. Unstructured pruning requires specialized hardware architectures tailored to the resulting sparsity patterns to realize its efficiency benefits, which makes the approach hard to scale. To address this, we propose PSViT, a novel methodology that performs structured pruning on SViT models, making it possible to efficiently accelerate their inference on existing, widely used computing architectures. PSViT employs several key steps: uniform channel-wise filter pruning to structurally eliminate non-significant weights, sensitivity analysis to evaluate the impact of channel-wise pruning of each individual layer on accuracy and network size, and fine-grained channel-wise pruning based on the sensitivity analysis and the given network architecture. Experimental results show that PSViT obtains 22.4% memory saving through single-shot pruning while keeping accuracy within 3 percentage points of the original non-pruned SViT model on ImageNet-1K (70.3% without fine-tuning and 72.8% with fine-tuning, versus 73.3% for the original model). These results also show that the PSViT methodology advances the effort toward efficient SViT deployment in resource-constrained applications.
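The three steps named above (channel-wise pruning, per-layer sensitivity analysis, and fine-grained ratio selection) can be illustrated with a minimal, dependency-free sketch. The abstract does not state the importance criterion or the rule that maps sensitivity to pruning ratios, so the L1-norm ranking, the tolerance threshold, the toy layer names, and the `toy_accuracy` proxy below are all assumptions made for illustration, not the PSViT implementation.

```python
# Illustrative sketch only. Assumptions (not from the paper): channels are
# ranked by L1 norm, and a layer is pruned only if its measured accuracy
# drop stays within a tolerance. All names here are hypothetical.

def channel_l1_norms(weight):
    """weight: list of output channels, each a flat list of floats."""
    return [sum(abs(w) for w in channel) for channel in weight]

def prune_channels(weight, ratio):
    """Remove the `ratio` fraction of output channels with the smallest L1 norm."""
    n_prune = int(len(weight) * ratio)
    norms = channel_l1_norms(weight)
    # Channel indices in ascending order of norm; the first n_prune are dropped.
    order = sorted(range(len(weight)), key=lambda i: norms[i])
    drop = set(order[:n_prune])
    return [ch for i, ch in enumerate(weight) if i not in drop]

def sensitivity_analysis(layers, ratio, evaluate):
    """Prune one layer at a time and record the resulting accuracy drop."""
    baseline = evaluate(layers)
    drops = {}
    for name in layers:
        trial = dict(layers)  # shallow copy; only one layer is replaced
        trial[name] = prune_channels(layers[name], ratio)
        drops[name] = baseline - evaluate(trial)
    return drops

def fine_grained_ratios(drops, base_ratio, tolerance):
    """Keep base_ratio for tolerant layers; leave sensitive layers unpruned."""
    return {name: (base_ratio if d <= tolerance else 0.0)
            for name, d in drops.items()}

# Toy two-layer "model": each layer has four output channels of two weights.
layers = {
    "attn": [[0.9, -0.8], [0.01, 0.02], [0.7, 0.6], [0.02, -0.01]],
    "mlp": [[0.5, 0.5], [0.4, -0.6], [0.5, -0.4], [0.6, 0.5]],
}

# Hypothetical accuracy proxy: fraction of total weight magnitude retained.
# A real pipeline would evaluate the pruned model on a validation set.
TOTAL = sum(sum(channel_l1_norms(w)) for w in layers.values())

def toy_accuracy(model):
    return sum(sum(channel_l1_norms(w)) for w in model.values()) / TOTAL

drops = sensitivity_analysis(layers, ratio=0.5, evaluate=toy_accuracy)
ratios = fine_grained_ratios(drops, base_ratio=0.5, tolerance=0.05)
pruned = {name: prune_channels(w, ratios[name]) for name, w in layers.items()}
```

In this toy setup the `attn` layer contains two near-zero channels, so pruning it costs little and it keeps the 50% ratio, while the `mlp` layer has uniformly significant channels and is left intact. Because whole channels are removed, the pruned layers remain dense and need no sparsity-aware hardware.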