APPT: Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding

Transformer-based networks have achieved impressive performance in 3D point cloud understanding. However, most of them concentrate on aggregating local features and do not directly model global dependencies, which limits their effective receptive field. How to combine local and global components effectively also remains challenging. To tackle these problems, we propose the Asymmetric Parallel Point Transformer (APPT). Specifically, we introduce Global Pivot Attention to extract global features and enlarge the effective receptive field. Moreover, we design an Asymmetric Parallel structure to integrate local and global information. With these designs, APPT captures global features throughout the entire network while still focusing on fine local details. Extensive experiments show that our method outperforms prior methods and achieves state-of-the-art results on several benchmarks for 3D point cloud understanding, including 3D semantic segmentation on S3DIS, 3D shape classification on ModelNet40, and 3D part segmentation on ShapeNet.
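The abstract does not define Global Pivot Attention, so the following is only a rough illustration of the general idea that every point can gain a global view by attending to a small set of selected points. It is a minimal sketch under stated assumptions, not the paper's formulation: the function name `pivot_attention`, the `pivot_indices` argument, the way pivots are chosen, and the absence of learned query/key/value projections are all hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def pivot_attention(features, pivot_indices):
    """Hypothetical sketch: every point attends to a small pivot subset.

    features:      list of N feature vectors (lists of floats)
    pivot_indices: indices of the M points used as pivots (assumed given)

    Cost is O(N * M) instead of the O(N^2) of full self-attention.
    No learned projections are used; this is an assumption for brevity.
    """
    pivots = [features[i] for i in pivot_indices]
    scale = math.sqrt(len(features[0]))
    output = []
    for query in features:
        # Scaled dot-product scores between this point and each pivot.
        weights = softmax([dot(query, p) / scale for p in pivots])
        # Weighted sum of pivot features gives the point's global feature.
        output.append([
            sum(w * p[c] for w, p in zip(weights, pivots))
            for c in range(len(query))
        ])
    return output


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    for row in pivot_attention(feats, pivot_indices=[0, 1]):
        print(row)
```

Because each output row is a softmax-weighted average of the pivot features, a point far from every pivot still receives information from across the whole cloud, which is one way a receptive field can be enlarged without full pairwise attention.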

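A point cloud obtained by laser measurement contains 3D coordinates (XYZ) and laser reflection intensity (Intensity). A point cloud obtained by photogrammetry contains 3D coordinates (XYZ) and color information (RGB). A point cloud obtained by combining laser measurement and photogrammetry contains 3D coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of every sampled point on an object's surface have been acquired, the result is a set of points, called a "point cloud".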
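The three kinds of point cloud described above differ only in which per-point attributes accompany the XYZ coordinates. A minimal sketch of that data layout follows; the `Point` class, the `channels` helper, and the sample values are hypothetical names and numbers chosen for illustration, not part of any particular library or dataset.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Point:
    """One sampled surface point; optional fields depend on the sensor."""
    xyz: Tuple[float, float, float]
    intensity: Optional[float] = None          # laser measurement only
    rgb: Optional[Tuple[int, int, int]] = None  # photogrammetry only


def channels(point: Point) -> List[str]:
    """Return the names of the attributes this point actually carries."""
    names = ["x", "y", "z"]
    if point.intensity is not None:
        names.append("intensity")
    if point.rgb is not None:
        names += ["r", "g", "b"]
    return names


# Illustrative values only.
laser_point = Point(xyz=(1.0, 2.0, 0.5), intensity=0.8)
photo_point = Point(xyz=(1.0, 2.0, 0.5), rgb=(200, 180, 160))
fused_point = Point(xyz=(1.0, 2.0, 0.5), intensity=0.8, rgb=(200, 180, 160))

# A point cloud is simply a collection of such points.
cloud = [laser_point, photo_point, fused_point]

if __name__ == "__main__":
    for p in cloud:
        print(channels(p))
```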
