Event-based sensors, with their high temporal resolution (1 µs) and dynamic range (120 dB), have the potential to be deployed on high-speed platforms such as vehicles and drones. However, the highly sparse and fluctuating nature of events poses challenges for conventional object detection techniques based on Artificial Neural Networks (ANNs). In contrast, Spiking Neural Networks (SNNs) are well suited to representing event-based data because of their inherent temporal dynamics. In particular, we demonstrate that the membrane potential dynamics can modulate network activity in response to fluctuating events and strengthen the features of sparse input. In addition, the spike-triggered adaptive threshold can stabilize training, which further improves network performance. Based on these findings, we develop an efficient spiking feature pyramid network for event-based object detection. Our proposed SNN outperforms previous SNNs and sophisticated ANNs with attention mechanisms, achieving a mean average precision (mAP50) of 47.7% on the Gen1 benchmark dataset. This result surpasses the previous best SNN by 9.7% and demonstrates the potential of SNNs for event-based vision. Our model has a concise architecture while maintaining high accuracy, and its sparse computation yields a much lower computation cost. Our code will be publicly available.
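To make the two mechanisms named above concrete, the following is a minimal sketch of a single leaky integrate-and-fire neuron with a spike-triggered adaptive threshold. It is not the authors' implementation: the function name, the parameter names (`decay`, `base_threshold`, `threshold_step`, `threshold_decay`), the hard reset to zero, and all numeric values are assumptions chosen for illustration only.

```python
def simulate_adaptive_lif(inputs, decay=0.9, base_threshold=1.0,
                          threshold_step=0.5, threshold_decay=0.8):
    """Simulate one leaky integrate-and-fire neuron with an adaptive threshold.

    Hypothetical illustration, not the paper's model.
    Returns (spikes, thresholds): the 0/1 spike train and the threshold
    that was in effect at each time step.
    """
    potential = 0.0    # membrane potential
    adaptation = 0.0   # extra threshold accumulated from recent spikes
    spikes, thresholds = [], []
    for current in inputs:
        # Leaky integration: the old potential decays, the new input is added.
        potential = decay * potential + current
        threshold = base_threshold + adaptation
        thresholds.append(threshold)
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # hard reset after a spike (an assumption)
            # Each spike raises the threshold; the increase then decays.
            adaptation = threshold_decay * adaptation + threshold_step
        else:
            spikes.append(0)
            adaptation = threshold_decay * adaptation
    return spikes, thresholds


# Sparse input: neither 0.6 pulse reaches the threshold alone, but the
# residual membrane potential lets the second pulse trigger a spike.
sparse_spikes, _ = simulate_adaptive_lif([0.6, 0.0, 0.6])
print(sparse_spikes)  # [0, 0, 1]

# Sustained input: the adaptive threshold damps the firing rate.
adaptive_spikes, _ = simulate_adaptive_lif(
    [1.0] * 6, decay=0.5, threshold_step=1.0, threshold_decay=0.5)
fixed_spikes, _ = simulate_adaptive_lif(
    [1.0] * 6, decay=0.5, threshold_step=0.0, threshold_decay=0.5)
print(adaptive_spikes)  # [1, 0, 1, 0, 0, 1]
print(fixed_spikes)     # [1, 1, 1, 1, 1, 1]
```

In this toy setting, the leak lets the neuron accumulate evidence from sparse events, while the adaptive threshold keeps activity bounded under sustained input. How these mechanisms are parameterized and trained in the proposed network is not specified in the abstract.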