Embedded Graph Convolutional Networks for Real-Time Event Data Processing on SoC FPGAs

From arXiv. Submitted to the IEEE Transactions on Circuits and Systems for Video Technology. This manuscript was first submitted for publication on March 31, 2024. It has since been revised twice: on May 22, 2024 and June 10, 2024.

The use of event cameras is an important and rapidly evolving trend aimed at addressing the constraints of traditional video systems. These cameras are particularly relevant in the automotive domain, where their lower latency and energy consumption suit integration into embedded real-time systems. One effective approach to achieving the throughput and latency required of event processing systems is the use of graph convolutional networks (GCNs). In this study, we introduce a series of hardware-aware optimisations tailored for PointNet++, a GCN architecture designed for point cloud processing. The proposed techniques result in a more than 100-fold reduction in model size compared to Asynchronous Event-based GNN (AEGNN), one of the most recent works in the field, with a relatively small decrease in accuracy (2.3% for N-Caltech101 classification, 1.7% for N-Cars classification), thus following the TinyML trend. Based on the software research, we designed a custom EFGCN (Event-Based FPGA-accelerated Graph Convolutional Network) and implemented it on the ZCU104 SoC FPGA platform, achieving a throughput of 13.3 million events per second (MEPS) and real-time, partially asynchronous processing with a latency of 4.47 ms. We also address the scalability of the proposed hardware model as a way to improve the obtained accuracy. To the best of our knowledge, this study is the first attempt at accelerating PointNet++ networks on SoC FPGAs, as well as the first hardware architecture exploration of graph convolutional network implementations for real-time continuous event data processing. We publish both software and hardware source code in an open repository: https://github.com/vision-agh/*** (will be published upon acceptance).
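To put the reported figures in perspective, the short sketch below converts the stated throughput (13.3 MEPS) into an average time budget per event and estimates how many events would arrive during one latency interval (4.47 ms) at that rate. The function names are hypothetical, and pairing the two figures in this way is only an illustrative assumption; the abstract does not say that the latency is measured over a window of events at peak throughput.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# Function names are illustrative and do not come from the paper's code.

THROUGHPUT_MEPS = 13.3  # reported throughput, million events per second
LATENCY_MS = 4.47       # reported processing latency, milliseconds


def ns_per_event(throughput_meps: float) -> float:
    """Average time budget per event, in nanoseconds, at the given throughput."""
    events_per_second = throughput_meps * 1e6
    return 1e9 / events_per_second


def events_in_window(throughput_meps: float, window_ms: float) -> float:
    """Number of events arriving within window_ms at the given throughput.

    Assumption (not from the paper): events arrive at the peak rate for the
    whole window.
    """
    events_per_second = throughput_meps * 1e6
    return events_per_second * (window_ms * 1e-3)


if __name__ == "__main__":
    print(f"Time budget per event: {ns_per_event(THROUGHPUT_MEPS):.2f} ns")
    print(f"Events per latency window: {events_in_window(THROUGHPUT_MEPS, LATENCY_MS):.0f}")
```

At 13.3 MEPS the average budget works out to roughly 75 ns per event, which shows why per-event work has to be pipelined in hardware rather than handled sequentially in software.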
