One of the main challenges in LiDAR-based 3D object detection is that sensors often fail to capture complete spatial information about objects because of long range and occlusion. Two-stage detectors with point cloud completion tackle this problem by using a pre-trained network to add points to the regions of interest (RoIs). However, these methods generate dense object point clouds for all region proposals, on the assumption that every RoI contains an object. As a result, points are generated indiscriminately, even for incorrect proposals. Motivated by this, we propose Point Generation R-CNN (PG-RCNN), a novel end-to-end detector that generates semantic surface points of foreground objects for accurate detection. Our method uses a jointly trained RoI point generation module to process the contextual information of RoIs and estimate the complete shape and displacement of foreground objects. To every generated point, PG-RCNN assigns a semantic feature that indicates its estimated foreground probability. Extensive experiments show that the point clouds generated by our method provide geometrically and semantically rich information for refining false-positive and misaligned proposals. PG-RCNN achieves competitive performance on the KITTI benchmark with significantly fewer parameters than state-of-the-art models. The code is available at https://github.com/quotation2520/PG-RCNN.
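To make the idea of per-point foreground scores concrete, here is a minimal sketch in plain Python. It is not the PG-RCNN implementation. The `RoI` class, the regular grid layout, the helper names, and the 0.5 threshold are all assumptions made for illustration, and the offsets and foreground probabilities that a trained network would predict are passed in as plain lists.

```python
import math
from dataclasses import dataclass


@dataclass
class RoI:
    """Hypothetical 3D box proposal: centre, size, and heading angle (radians)."""
    cx: float
    cy: float
    cz: float
    length: float
    width: float
    height: float
    yaw: float


def grid_points(roi, n=2):
    """Place n*n*n cell-centre points inside the RoI, rotated into the sensor frame."""
    cos_y, sin_y = math.cos(roi.yaw), math.sin(roi.yaw)
    points = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Cell-centre coordinates in the box's own frame.
                lx = ((i + 0.5) / n - 0.5) * roi.length
                ly = ((j + 0.5) / n - 0.5) * roi.width
                lz = ((k + 0.5) / n - 0.5) * roi.height
                # Rotate about the vertical axis, then translate to the box centre.
                x = roi.cx + lx * cos_y - ly * sin_y
                y = roi.cy + lx * sin_y + ly * cos_y
                z = roi.cz + lz
                points.append((x, y, z))
    return points


def attach_scores(points, offsets, fg_probs):
    """Shift each point by its (assumed predicted) offset and append its foreground probability."""
    return [
        (x + dx, y + dy, z + dz, p)
        for (x, y, z), (dx, dy, dz), p in zip(points, offsets, fg_probs)
    ]


def keep_foreground(scored_points, threshold=0.5):
    """Keep only points whose foreground probability reaches the (assumed) threshold."""
    return [pt for pt in scored_points if pt[3] >= threshold]


if __name__ == "__main__":
    roi = RoI(cx=0.0, cy=0.0, cz=0.0, length=4.0, width=2.0, height=2.0, yaw=0.0)
    pts = grid_points(roi, n=2)
    # Stand-in values; a trained model would supply these per point.
    offsets = [(0.0, 0.0, 0.0)] * len(pts)
    fg_probs = [0.9] * 4 + [0.1] * 4
    scored = attach_scores(pts, offsets, fg_probs)
    print(len(keep_foreground(scored)), "of", len(scored), "points kept")
```

In this sketch, a proposal that contains no object would receive low foreground probabilities on most of its points, so a later refinement stage could discount them. How PG-RCNN actually consumes the scores is described in the paper and the linked repository.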