LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling

Pre-trained point cloud models based on Masked Point Modeling (MPM) have exhibited substantial improvements across various tasks. However, these models rely heavily on the Transformer, leading to quadratic complexity and a limited decoder, which hinders their practical application. To address this limitation, we first conduct a comprehensive analysis of existing Transformer-based MPM, emphasizing the idea that redundancy reduction is crucial for point cloud analysis. To this end, we propose a Locally constrained Compact point cloud Model (LCM) consisting of a locally constrained compact encoder and a locally constrained Mamba-based decoder. Our encoder replaces self-attention with our local aggregation layers to achieve an elegant balance between performance and efficiency. Considering the varying information density between masked and unmasked patches in the decoder inputs of MPM, we introduce a locally constrained Mamba-based decoder. This decoder ensures linear complexity while maximizing the perception of point cloud geometry information from unmasked patches, which have higher information density. Extensive experimental results show that our compact model significantly surpasses existing Transformer-based models in both performance and efficiency. In particular, compared to its Transformer-based counterpart, our LCM-based Point-MAE model improves performance by 2.24%, 0.87%, and 0.94% on the three variants of ScanObjectNN while reducing parameters by 88% and computation by 73%.
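To make the idea of replacing self-attention with local aggregation concrete, the following is a minimal sketch and not the LCM layer itself, whose exact design the abstract does not specify. It assumes that each patch is summarized by a 3D center and a feature vector, that a patch's neighborhood is its k nearest centers, and that neighbors are combined with a channel-wise max. The names `knn_indices` and `local_max_aggregate` and the toy data are hypothetical.

```python
import math


def knn_indices(centers, k):
    """For each center, return the indices of its k nearest centers (itself included)."""
    result = []
    for c in centers:
        # Brute-force search, used here only for clarity.
        order = sorted(range(len(centers)), key=lambda j: math.dist(c, centers[j]))
        result.append(order[:k])
    return result


def local_max_aggregate(features, neighbors):
    """Channel-wise max over each patch's neighborhood (an assumed pooling choice)."""
    dim = len(features[0])
    return [
        [max(features[j][d] for j in idx) for d in range(dim)]
        for idx in neighbors
    ]


# Toy data: two well-separated clusters of three patch centers each.
centers = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),
]
features = [
    [1.0, 0.0], [0.0, 2.0], [0.5, 0.5],
    [3.0, -1.0], [-2.0, 4.0], [0.0, 0.0],
]

neighbors = knn_indices(centers, k=3)
aggregated = local_max_aggregate(features, neighbors)
print(aggregated)
```

In this sketch each patch mixes information from only k neighbors, so the aggregation step touches N·k patch pairs rather than the N² pairs that global self-attention compares. The brute-force neighbor search here is itself quadratic and would be replaced by a spatial index in practice.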

