Deep Directly-Trained Spiking Neural Networks for Object Detection

Spiking neural networks (SNNs) are brain-inspired, energy-efficient models that encode information in spatiotemporal dynamics. Recently, directly-trained deep SNNs have achieved high performance on classification tasks with very few time steps. However, designing a directly-trained SNN for the regression task of object detection remains challenging. To address this problem, we propose EMS-YOLO, a novel directly-trained SNN framework for object detection, which is the first attempt to train a deep SNN for object detection with surrogate gradients rather than with ANN-SNN conversion strategies. Specifically, we design a full-spike residual block, EMS-ResNet, which can effectively extend the depth of the directly-trained SNN with low power consumption. Furthermore, we theoretically analyze and prove that EMS-ResNet can avoid vanishing or exploding gradients. The results demonstrate that our approach outperforms state-of-the-art ANN-SNN conversion methods (which need at least 500 time steps) while using far fewer time steps (only 4). Our model achieves performance comparable to an ANN with the same architecture while consuming 5.83 times less energy on the frame-based COCO dataset and the event-based Gen1 dataset.
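For readers unfamiliar with surrogate-gradient training, the following is a minimal sketch of the general idea and not the EMS-YOLO implementation. It assumes a single leaky integrate-and-fire neuron with a hard reset and a rectangular surrogate window; the constants `V_TH`, `DECAY`, and `WIDTH` and all function names are illustrative assumptions that do not come from the abstract.

```python
# Sketch of a leaky integrate-and-fire (LIF) neuron with a surrogate gradient.
# All constants and names are illustrative assumptions, not values from the paper.

V_TH = 0.45    # firing threshold (assumed)
DECAY = 0.25   # membrane leak factor (assumed)
WIDTH = 1.0    # width of the rectangular surrogate window (assumed)


def spike(v):
    """Forward pass: Heaviside step, 1.0 once the potential reaches threshold."""
    return 1.0 if v >= V_TH else 0.0


def surrogate_grad(v):
    """Backward-pass stand-in for the derivative of the step function.

    The true derivative is zero almost everywhere, so a finite-width
    rectangular window around the threshold is substituted during training.
    """
    return 1.0 / WIDTH if abs(v - V_TH) < WIDTH / 2 else 0.0


def lif_forward(inputs):
    """Run one neuron over the time steps in `inputs` with a hard reset.

    Returns the spike train and the pre-reset membrane potentials.
    """
    v = 0.0
    spikes, potentials = [], []
    for x in inputs:
        v = DECAY * v + x      # leaky integration of the input current
        s = spike(v)
        potentials.append(v)
        spikes.append(s)
        v = v * (1.0 - s)      # reset the potential to 0 after a spike
    return spikes, potentials


# Four time steps, matching the time-step count quoted in the abstract.
spikes, potentials = lif_forward([0.4, 0.4, 0.4, 0.4])
grads = [surrogate_grad(v) for v in potentials]
print(spikes)
print(grads)
```

In a full network, the surrogate value would replace the step function's derivative during backpropagation through time, which is what allows the SNN to be trained directly instead of being converted from a trained ANN.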

