Recent advances in event camera research emphasize processing data in its original sparse form, which preserves the sensor's distinctive properties: high temporal resolution, high dynamic range, low latency, and resistance to image blur. One promising approach to analyzing event data is the graph convolutional network (GCN). However, current research in this domain focuses primarily on reducing computational cost and neglects the associated memory cost. In this paper, we consider both factors together in order to achieve satisfactory results with relatively low model complexity. To this end, we performed a comparative analysis of different graph convolution operations, considering execution time, the number of trainable model parameters, data format requirements, and training outcomes. Our results show a 450-fold reduction in the number of parameters for the feature extraction module and a 4.5-fold reduction in the size of the data representation while maintaining a classification accuracy of 52.3%, which is 6.3% higher than that of the operation used in state-of-the-art approaches. To further evaluate performance, we implemented an object detection architecture and tested it on the N-Caltech101 dataset. It achieved an accuracy of 53.7% mAP@0.5 and an execution rate of 82 graphs per second.
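To make the notion of processing events "in sparse form" concrete, the following is a minimal sketch of how an event stream might be turned into a graph before any graph convolution is applied. It is an illustration under stated assumptions, not the construction used in the paper: the function name `build_event_graph`, the parameters `radius` and `time_scale`, and the brute-force neighbor search are all hypothetical choices made for this example. Each event is assumed to be a tuple `(x, y, t, polarity)`.

```python
from math import dist


def build_event_graph(events, radius, time_scale=1.0):
    """Build a simple radius graph from events (hypothetical helper).

    events: list of (x, y, t, polarity) tuples (assumed input format).
    radius: maximum distance between connected nodes in scaled (x, y, t) space.
    time_scale: factor that brings timestamps to a range comparable with pixels.
    """
    # Node positions live in a 3D space-time volume.
    positions = [(float(x), float(y), t * time_scale) for x, y, t, _ in events]
    # The polarity serves as the only node feature in this sketch.
    features = [[float(p)] for _, _, _, p in events]

    # Brute-force O(n^2) neighbor search; fine for a sketch, too slow for real streams.
    edges = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if dist(positions[i], positions[j]) <= radius:
                edges.append((i, j))  # store both directions so the
                edges.append((j, i))  # graph is effectively undirected
    return positions, features, edges


if __name__ == "__main__":
    sample = [(0, 0, 0.0, 1), (1, 0, 0.001, -1), (10, 10, 0.002, 1)]
    _, _, sample_edges = build_event_graph(sample, radius=2.0, time_scale=1000.0)
    print(sample_edges)  # [(0, 1), (1, 0)]
```

Only the events themselves and the edge list are stored, with no dense frame, which is why the size of the data representation and the parameter count of the convolution operating on it both matter for memory cost.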