Best of Both Worlds: Hybrid SNN-ANN Architecture for Event-based Optical Flow Estimation

In robotics, event-based cameras are emerging as a promising low-power alternative to traditional frame-based cameras for capturing high-speed motion and high-dynamic-range scenes, owing to their sparse and asynchronous event outputs. Spiking Neural Networks (SNNs), with their asynchronous event-driven compute, show great potential for extracting spatio-temporal features from these event streams. In contrast, standard Analog Neural Networks (ANNs) fail to process event data effectively. However, training SNNs is difficult because of additional trainable parameters (thresholds and leaks), vanishing spikes at deeper layers, and a non-differentiable binary activation function. Furthermore, SNNs require an additional data structure, the membrane potential, which keeps track of temporal information and must be fetched and updated at every timestep. To overcome these challenges, we propose a novel SNN-ANN hybrid architecture that combines the strengths of both. Specifically, we leverage the asynchronous compute capabilities of SNN layers to effectively extract temporal information from the input. Concurrently, the ANN layers facilitate training and efficient deployment on traditional machine learning hardware such as GPUs. We provide extensive experimental analysis for assigning each layer to be spiking or analog, leading to a network configuration optimized for performance and ease of training. We evaluate our hybrid architecture for optical flow estimation on the DSEC-flow and Multi-Vehicle Stereo Event-Camera (MVSEC) datasets. On the DSEC-flow dataset, the hybrid SNN-ANN architecture achieves a 40% reduction in average endpoint error (AEE) with 22% lower energy consumption compared to Full-SNN, and 48% lower AEE compared to Full-ANN, while maintaining comparable energy usage.
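To make two of the terms above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a generic leaky integrate-and-fire neuron with a hard reset (the abstract does not specify the neuron model or reset scheme) and the conventional definition of AEE as the mean Euclidean distance between predicted and ground-truth flow vectors. The names `lif_step`, `run_lif`, and `average_endpoint_error`, and the default `leak` and `threshold` values, are hypothetical and chosen for illustration only.

```python
import math


def lif_step(v, x, leak=0.9, threshold=1.0):
    """One timestep of a leaky integrate-and-fire neuron (assumed model).

    v is the membrane potential carried over from the previous timestep,
    x is the input current at this timestep.
    """
    v = leak * v + x                      # decay the stored potential, then integrate
    spike = 1 if v >= threshold else 0    # binary, non-differentiable activation
    if spike:
        v = 0.0                           # hard reset after firing (an assumption)
    return v, spike


def run_lif(inputs, leak=0.9, threshold=1.0):
    """Run the neuron over a sequence; the potential is read and written every step."""
    v = 0.0
    spikes = []
    for x in inputs:
        v, s = lif_step(v, x, leak, threshold)
        spikes.append(s)
    return spikes


def average_endpoint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth (u, v) flow vectors."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    total = sum(math.hypot(pu - gu, pv - gv)
                for (pu, pv), (gu, gv) in zip(pred, gt))
    return total / len(pred)


if __name__ == "__main__":
    print(run_lif([0.6, 0.6, 0.0, 0.6, 0.6]))
    print(average_endpoint_error([(3.0, 4.0), (0.0, 0.0)],
                                 [(0.0, 0.0), (0.0, 0.0)]))
```

The loop in `run_lif` shows why the membrane potential adds memory traffic: it is state that must be loaded and stored at every timestep, whereas an ANN layer is stateless across inputs. The hard threshold in `lif_step` is the step that blocks ordinary backpropagation. Benchmark evaluations typically average AEE only over pixels with valid ground truth, a detail this sketch omits.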
