While point-based neural architectures have proven effective, their time-consuming sampling step currently prevents them from performing real-time inference on scene-level point clouds. Existing methods attempt to overcome this issue by replacing the commonly adopted farthest point sampling~(FPS) with random sampling, but at the cost of lower performance, so the trade-off between effectiveness and efficiency remains under-explored. In this paper, we show that the key to high-quality sampling is ensuring even spacing between the points in the subset, which can be obtained naturally through a grid. Based on this insight, we propose a hierarchical adaptive voxel-guided point sampler with linear complexity and high parallelism for real-time applications. Extensive experiments on large-scale point cloud detection and segmentation tasks demonstrate that our method achieves performance competitive with FPS, the strongest sampler, while running more than 100 times faster. This gain in efficiency removes the sampling bottleneck when handling scene-level point clouds. Furthermore, our sampler can be easily integrated into existing models, reducing their runtime by 20$\sim$80\% with minimal effort. The code will be available at https://github.com/OuyangJunyuan/pointcloud-3d-detector-tensorrt
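To make the grid intuition concrete, the following is a minimal sketch of plain single-level voxel-grid sampling in NumPy. It is not the hierarchical adaptive sampler proposed here: the function name `voxel_grid_sample`, the fixed `voxel_size` parameter, and the rule of keeping the first point per voxel are all assumptions made for illustration. It only shows how assigning points to grid cells and keeping one point per occupied cell yields a roughly evenly spaced subset.

```python
import numpy as np


def voxel_grid_sample(points, voxel_size):
    """Return sorted indices of one point per occupied voxel.

    Illustrative only: a fixed-size, single-level grid, not the paper's
    hierarchical adaptive sampler. `points` is an (N, 3) array.
    """
    points = np.asarray(points, dtype=np.float64)
    # Integer voxel coordinates, measured from the cloud's minimum corner.
    coords = np.floor((points - points.min(axis=0)) / voxel_size).astype(np.int64)
    # Flatten the 3D voxel coordinates into a single integer key per point.
    dims = coords.max(axis=0) + 1
    keys = (coords[:, 0] * dims[1] + coords[:, 1]) * dims[2] + coords[:, 2]
    # np.unique sorts the keys, so this step is O(N log N); a hash table
    # keyed by voxel would be needed for a linear-time variant.
    _, first_idx = np.unique(keys, return_index=True)
    return np.sort(first_idx)


# The first two points share a voxel, so only one of them is kept.
pts = np.array([[0.0, 0.0, 0.0], [0.1, 0.1, 0.1], [1.5, 0.0, 0.0]])
print(voxel_grid_sample(pts, voxel_size=1.0).tolist())  # [0, 2]
```

Unlike FPS, which chooses each point based on all previously selected points, this pass treats every point independently when computing its voxel key, which is what makes grid-based sampling easy to parallelize. A fixed `voxel_size` does not control how many points are returned, so a practical sampler needs some way to adapt the grid to a target sample count.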