Point Cloud Networks (PCNs) have achieved great success in many point cloud tasks, such as object part segmentation and shape classification. The most popular point-based PCNs are usually composed of two sequential steps: Data Structuring (DS) and Feature Computation (FC). In this paper, we first describe an important characteristic of the PCN-specific DS step that existing PCN accelerators have not addressed: the spatial locality that results from overlapping points among the gathered point subsets. Using algorithm-hardware co-design, L-PCN (Locality-aware PCN) introduces two novel techniques that exploit this characteristic to reduce the large number of repetitive operations in the overall PCN. The first is a point cloud partitioning technique, Octree-based Islandization. Using Octree-based adjacency gathering, L-PCN partitions a point cloud into islands, where the point subsets inside the same island exhibit strong spatial correlation. After partitioning, L-PCN performs the remaining PCN steps at the granularity of islands. The second technique, Hub-based Scheduling, schedules the intra-island computation to exploit intra-island data reuse by dynamically caching, updating, and reusing the repeated data. Both techniques are implemented in an Islandization Unit, which can be seamlessly integrated into the standard PCN workflow. Our evaluation shows that, by exploiting spatial locality, L-PCN achieves a theoretical reduction in feature fetching ranging from 55.2% to 93.8% and in feature computation ranging from 45.4% to 80.6% during the PCN process. For experimentation, prototype L-PCN accelerators are implemented on the Intel Arria 10 GX FPGA. Experimental results show that, with the Islandization Unit as a plug-in, state-of-the-art PCN accelerators can achieve an additional speedup ranging from 1.2x to 3.2x.
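To make the overlap argument concrete, the following minimal sketch counts how many point features would be fetched when each gathered subset is processed independently versus when subsets that share an island fetch each shared point only once. It is an illustration under stated assumptions, not the L-PCN design: `knn_subsets`, `group_by_cell`, and `fetch_counts` are hypothetical helper names, a uniform grid stands in for Octree-based Islandization, and taking the union of subsets is an idealized stand-in for the reuse that Hub-based Scheduling targets.

```python
from collections import defaultdict


def knn_subsets(points, centers, k):
    """For each center (an index into points), return the index set of its k nearest points."""
    subsets = []
    for c in centers:
        cx, cy, cz = points[c]
        # Brute-force squared-distance ranking; fine for a small illustration.
        order = sorted(
            range(len(points)),
            key=lambda i: (points[i][0] - cx) ** 2
            + (points[i][1] - cy) ** 2
            + (points[i][2] - cz) ** 2,
        )
        subsets.append(set(order[:k]))
    return subsets


def group_by_cell(points, centers, cell):
    """ASSUMPTION: bucket centers by uniform grid cell as a simple stand-in for islands."""
    islands = defaultdict(list)
    for pos, c in enumerate(centers):
        key = tuple(int(coord // cell) for coord in points[c])
        islands[key].append(pos)  # store the position of the subset, not the point index
    return list(islands.values())


def fetch_counts(subsets, islands):
    """Return (fetches without reuse, fetches when each island fetches a shared point once)."""
    baseline = sum(len(s) for s in subsets)
    reused = sum(
        len(set().union(*(subsets[p] for p in island))) for island in islands
    )
    return baseline, reused


# Hand-built example: subsets 0 and 1 overlap on points 1 and 2, subset 2 is disjoint.
example_subsets = [{0, 1, 2}, {1, 2, 3}, {7, 8, 9}]
example_islands = [[0, 1], [2]]
print(fetch_counts(example_subsets, example_islands))
```

The gap between the two counts grows with the overlap among subsets in the same island, which is the quantity the abstract refers to as spatial locality. The sketch says nothing about the reported reduction ranges, which come from the authors' own evaluation.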