Pointy - A Lightweight Transformer for Point Cloud Foundation Models

Foundation models for point cloud data have recently grown in capability, often leveraging extensive representation learning from language or vision. In this work, we take a more controlled approach by introducing a lightweight transformer-based point cloud architecture. Rather than relying heavily on cross-modal supervision, our model is trained on only 39k point clouds, yet it outperforms several larger foundation models trained on over 200k samples. Our method also approaches state-of-the-art results from models that have seen over a million point clouds, images, and text samples, demonstrating the value of a carefully curated training setup and architecture. To ensure rigorous evaluation, we conduct a comprehensive replication study that standardizes the training regime and benchmarks across multiple point cloud architectures. This unified experimental framework isolates the impact of architectural choices, allowing for transparent comparisons and highlighting the benefits of our design and other tokenizer-free architectures. Our results show that simple backbones can deliver results competitive with more complex or data-rich strategies. The implementation, including code, pre-trained models, and training protocols, is available at https://github.com/KonradSzafer/Pointy.
