Neuromorphic processors are well-suited for efficiently handling sparse events from event-based cameras. However, their computing demand and hardware cost grow significantly as the input resolution increases. This paper proposes Trainable Region-of-Interest Prediction (TRIP), the first hardware-efficient hard attention framework for event-based vision processing on a neuromorphic processor. The TRIP framework actively produces low-resolution Regions-of-Interest (ROIs) for efficient and accurate classification. It exploits the inherently low information density of sparse events to reduce the overhead of ROI prediction. We introduce extensive hardware-aware optimizations for TRIP and implement the hardware-optimized algorithm on the SENECA neuromorphic processor. We evaluate the approach on multiple event-based classification datasets. It achieves state-of-the-art accuracy on all of them and produces reasonable ROIs with varying locations and sizes. On the DvsGesture dataset, our solution requires 46x less computation than the state-of-the-art while achieving higher accuracy. Furthermore, TRIP enables more than 2x improvements in latency and energy on the SENECA neuromorphic processor compared to the conventional solution.
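To make the hard-attention idea concrete, the following minimal sketch shows how events falling inside a square ROI could be accumulated into a low-resolution count grid. It is an illustration only, not the TRIP implementation: the event layout `(x, y, timestamp, polarity)`, the function name `crop_events_to_roi`, the `(x0, y0, size)` ROI format, and the count-grid output are all assumptions made for this example, and the ROI is supplied by hand rather than predicted by a trained network.

```python
from typing import List, Tuple

# Hypothetical event layout: (x, y, timestamp, polarity).
Event = Tuple[int, int, float, int]


def crop_events_to_roi(
    events: List[Event], roi: Tuple[int, int, int], out_res: int
) -> List[List[int]]:
    """Accumulate events inside a square ROI into an out_res x out_res count grid.

    roi is (x0, y0, size): top-left corner and side length in input pixels.
    Events outside the ROI are discarded (hard attention).
    """
    x0, y0, size = roi
    grid = [[0] * out_res for _ in range(out_res)]
    for x, y, _t, _p in events:
        if x0 <= x < x0 + size and y0 <= y < y0 + size:
            # Map the ROI-relative coordinate onto the low-resolution grid.
            col = (x - x0) * out_res // size
            row = (y - y0) * out_res // size
            grid[row][col] += 1
    return grid


# Four example events; the third lies outside the ROI and is dropped.
events = [(10, 12, 0.001, 1), (11, 12, 0.002, 0), (100, 90, 0.003, 1), (40, 41, 0.004, 1)]
frame = crop_events_to_roi(events, roi=(8, 8, 64), out_res=4)
```

Because only events inside the ROI are processed, and they are mapped to a fixed small grid, the downstream classifier's input size in this sketch is independent of the sensor resolution.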