The recent surge in 3D data acquisition has spurred the development of geometric deep learning models for point cloud processing, boosted by the remarkable success of transformers in natural language processing. Although point cloud transformers (PTs) have achieved impressive results, their computational cost scales quadratically with the point cloud size, which poses a significant scalability challenge for real-world applications. To address this issue, we propose the Adaptive Point Cloud Transformer (AdaPT), a standard PT model augmented with an adaptive token selection mechanism. AdaPT dynamically reduces the number of tokens during inference, enabling efficient processing of large point clouds. Furthermore, we introduce a budget mechanism that flexibly adjusts the computational cost of the model at inference time, without retraining or fine-tuning separate models. Our extensive experimental evaluation on point cloud classification tasks demonstrates that AdaPT significantly reduces computational complexity while maintaining accuracy competitive with standard PTs. The code for AdaPT is publicly available.
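To make the idea of budgeted token reduction concrete, the following is a minimal sketch and not the AdaPT implementation. It assumes a simple top-k rule over per-token importance scores, which the abstract does not specify; the names `select_tokens` and `attention_pair_count`, the `budget` parameter, and the random scores are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def select_tokens(tokens, scores, budget):
    """Keep the highest-scoring fraction `budget` of tokens.

    Assumption: a plain top-k rule; the selection rule used by AdaPT may differ.
    """
    n = tokens.shape[0]
    k = max(1, int(np.ceil(budget * n)))      # number of tokens to keep
    keep = np.sort(np.argsort(scores)[-k:])   # top-k indices, in original order
    return tokens[keep], keep


def attention_pair_count(n_tokens):
    """Query-key pairs in full self-attention: quadratic in the token count."""
    return n_tokens * n_tokens


rng = np.random.default_rng(0)
tokens = rng.normal(size=(1024, 64))   # 1024 tokens with 64 features each
scores = rng.normal(size=1024)         # stand-in for a learned importance score

kept, idx = select_tokens(tokens, scores, budget=0.25)
ratio = attention_pair_count(tokens.shape[0]) / attention_pair_count(kept.shape[0])
print(kept.shape, ratio)  # (256, 64) 16.0
```

Because attention cost is quadratic in the number of tokens, keeping a quarter of the tokens reduces the number of query-key pairs by a factor of 16 in this sketch. Changing `budget` at call time changes the cost without touching any weights, which is the kind of inference-time control the abstract describes.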