Point-Voxel Absorbing Graph Representation Learning for Event Stream based Recognition

To balance performance and efficiency, point- and voxel-based sampling methods are usually employed to down-sample dense events into sparse ones. A popular next step is to build a graph model that treats the sparse points or voxels as nodes and applies graph neural networks (GNNs) to learn a representation of the event data. Although these methods perform well, their results remain limited by two issues. (1) Existing event GNNs generally add a max (or mean) pooling layer that summarizes all node embeddings into a single graph-level representation of the whole event stream. This pooling step fails to capture the importance of individual graph nodes and does not make full use of the node representations. (2) Existing methods generally employ either a sparse point graph or a voxel graph representation model, and therefore ignore the complementarity between the two. To address these issues, we propose a novel dual point-voxel absorbing graph representation learning method for event stream data. Specifically, given an input event stream, we first transform it into a sparse event cloud and voxel grids, and build an absorbing graph model for each of them. We then design a novel absorbing graph convolutional network (AGCN) for representation and learning on the dual absorbing graphs. The key property of the proposed AGCN is that, through the introduced absorbing nodes, it captures the importance of nodes and remains fully aware of the node representations when summarizing them. Finally, the event representations from the two learning branches are concatenated to exploit the complementary information of the two cues, and the result is fed into a linear layer for event data classification.
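The pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' AGCN: the abstract does not specify how the absorbing node is connected, initialized, or weighted, so the sketch assumes one extra node linked to every point (or voxel) node, initialized with the mean feature, and propagated with a standard symmetrically normalized GCN layer. All function names (`add_absorbing_node`, `gcn_layer`, `branch_embedding`, `classify`) are hypothetical. With uniform edge weights the absorbing node does not yet learn node importance; a learned weighting would take its place in a real model.

```python
import numpy as np


def add_absorbing_node(adj, feats):
    """Append one extra node linked to every existing node (assumed design)."""
    n = adj.shape[0]
    aug = np.zeros((n + 1, n + 1), dtype=float)
    aug[:n, :n] = adj
    aug[n, :n] = 1.0  # absorbing node connects to every original node
    aug[:n, n] = 1.0  # keep the graph undirected for symmetric normalization
    # Assumption: initialize the absorbing node with the mean node feature.
    aug_feats = np.vstack([feats, feats.mean(axis=0, keepdims=True)])
    return aug, aug_feats


def gcn_layer(adj, feats, weight):
    """One standard GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)


def branch_embedding(adj, feats, weights):
    """Run one branch and read out the absorbing node instead of pooling."""
    adj, h = add_absorbing_node(adj, feats)
    for w in weights:
        h = gcn_layer(adj, h, w)
    return h[-1]  # last row is the absorbing node's embedding


def classify(point_graph, voxel_graph, point_ws, voxel_ws, w_out, b_out):
    """Concatenate both branch embeddings and apply a linear layer."""
    z = np.concatenate([
        branch_embedding(*point_graph, point_ws),
        branch_embedding(*voxel_graph, voxel_ws),
    ])
    return z @ w_out + b_out  # class logits
```

Here `point_graph` and `voxel_graph` are `(adjacency, features)` pairs. Reading the graph-level representation from the absorbing node's row replaces the max or mean pooling layer that the abstract identifies as a limitation.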
