The unstructured nature of point clouds demands that local aggregation adapt to different local structures. Previous methods meet this demand by explicitly embedding spatial relations into each aggregation process. Although this coupled approach has been shown to be effective in generating clear semantics, aggregation can be greatly slowed by repeated relation learning and by redundant computation to mix directional and point features. In this work, we propose to decouple the explicit modelling of spatial relations from local aggregation. We theoretically prove that basic neighbor pooling operations can also function without loss of clarity in feature fusion, so long as essential spatial information has been encoded in point features. As an instantiation of decoupled local aggregation, we present DeLA, a lightweight point network in which each learning stage first forms relative spatial encodings and then uses only pointwise convolutions plus edge max-pooling for local aggregation. Further, a regularization term is employed to reduce potential ambiguity through the prediction of relative coordinates. Though conceptually simple, DeLA achieves state-of-the-art performance with reduced or comparable latency in experiments on five classic benchmarks. Specifically, DeLA achieves over 90% overall accuracy on ScanObjectNN and 74% mIoU on S3DIS Area 5. Our code is available at https://github.com/Matrix-ASC/DeLA.
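To make the decoupling concrete, the following is a minimal NumPy sketch of one stage in the spirit described above: spatial relations are encoded once into point features, after which aggregation uses only a pointwise linear map and max-pooling over edges. It is an illustration under stated assumptions, not the released DeLA implementation. The function names, the brute-force k-nearest-neighbor search, the two-layer encoder, the residual connection, and the form of edge max-pooling (the maximum over neighbors of the feature difference f_j - f_i) are all hypothetical choices made here, and the regularization term is omitted.

```python
import numpy as np

def knn_indices(xyz, k):
    """Brute-force k nearest neighbors; each point is its own first neighbor."""
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)  # (N, N) squared distances
    return np.argsort(d2, axis=1)[:, :k]                     # (N, k) neighbor indices

def spatial_encoding(xyz, idx, w1, w2):
    """Encode relative coordinates once per stage (assumed two-layer encoder)."""
    rel = xyz[idx] - xyz[:, None, :]        # (N, k, 3) neighbor offsets
    h = np.maximum(rel @ w1, 0.0) @ w2      # (N, k, C) per-edge embedding
    return h.max(axis=1)                    # (N, C) pooled into point features

def edge_max_pool(feat, idx):
    """Max over neighbors of (f_j - f_i); uses no coordinates at all."""
    return (feat[idx] - feat[:, None, :]).max(axis=1)

def decoupled_block(feat, idx, w):
    """Pointwise linear map, then edge max-pooling, with an assumed residual."""
    return feat + edge_max_pool(feat @ w, idx)

rng = np.random.default_rng(0)
xyz = rng.normal(size=(64, 3))              # toy point cloud
idx = knn_indices(xyz, k=8)

# Spatial relations are modelled here, once.
feat = spatial_encoding(xyz, idx, rng.normal(size=(3, 16)), rng.normal(size=(16, 16)))

# Repeated aggregation blocks reuse the same neighbor indices and never touch xyz.
for _ in range(2):
    feat = decoupled_block(feat, idx, 0.1 * rng.normal(size=(16, 16)))

print(feat.shape)
```

In this sketch the coordinates enter only through `spatial_encoding`; the aggregation blocks operate purely on features, which is the property that avoids repeated relation learning in each aggregation step.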