Event-based cameras have recently shown great potential for high-speed motion estimation owing to their ability to capture temporally rich information asynchronously. Spiking Neural Networks (SNNs), with their neuro-inspired event-driven processing, can efficiently handle such asynchronous data, while neuron models such as the leaky integrate-and-fire (LIF) neuron can keep track of the essential timing information contained in the inputs. SNNs achieve this by maintaining a dynamic state in the neuron memory, retaining important information while forgetting redundant data over time. We therefore posit that SNNs can outperform similarly sized Analog Neural Networks (ANNs) on sequential regression tasks. However, deep SNNs are difficult to train because spikes vanish in later layers. To that end, we propose an adaptive fully-spiking framework with learnable neuronal dynamics to alleviate the spike vanishing problem. We use surrogate gradient-based backpropagation through time (BPTT) to train our deep SNNs from scratch. We validate our approach on the task of optical flow estimation using the Multi-Vehicle Stereo Event-Camera (MVSEC) dataset and the DSEC-Flow dataset. Our experiments on these datasets show a mean reduction of 13% in average endpoint error (AEE) compared to state-of-the-art ANNs. We also explore several down-scaled models and observe that our SNN models consistently outperform similarly sized ANNs, offering 10%-16% lower AEE. These results demonstrate the importance of SNNs for smaller models and their suitability at the edge. In terms of efficiency, our SNNs offer substantial savings in network parameters (48.3x) and computational energy (10.2x) while attaining ~10% lower endpoint error (EPE) compared to the state-of-the-art ANN implementations.
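To make the role of the neuronal dynamics concrete, the following is a minimal sketch of a single discrete-time LIF neuron. It is not the formulation used in this work, which the abstract does not specify: the function name `lif_run`, the multiplicative `leak` factor, the fixed `threshold`, and the soft reset (subtracting the threshold after a spike) are all illustrative assumptions. In a trainable framework, `leak` and `threshold` would be the kind of quantities one might make learnable.

```python
def lif_run(inputs, leak=0.9, threshold=1.0):
    """Simulate one LIF neuron over a sequence of input currents.

    Hypothetical sketch: `leak` and `threshold` are assumed, fixed values
    here; a soft reset is assumed after each spike.
    """
    v = 0.0          # membrane potential (the neuron's internal state)
    spikes = []
    for x in inputs:
        v = leak * v + x           # decay the old state, integrate new input
        if v >= threshold:
            spikes.append(1)       # emit a spike
            v -= threshold         # soft reset: keep the residual potential
        else:
            spikes.append(0)       # stay silent, state carries over
    return spikes


if __name__ == "__main__":
    currents = [0.6, 0.6, 0.0, 0.0, 0.6]
    print(lif_run(currents, leak=0.9, threshold=1.0))
    print(lif_run(currents, leak=0.9, threshold=0.5))
```

With the same input, lowering the threshold makes the neuron fire more often, which suggests why fixed dynamics can leave deeper layers with too few spikes and why adapting these parameters per layer may help.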