In autonomous driving, LiDAR sensors are vital for acquiring 3D point clouds that provide reliable geometric information. However, traditional preprocessing sampling methods often ignore semantic features, leading to detail loss and ground-point interference in 3D object detection. To address this, we propose a multi-branch two-stage 3D object detection framework built on a Semantic-aware Multi-branch Sampling (SMS) module and multi-view consistency constraints. The SMS module comprises three branches: random sampling, Density Equalization Sampling (DES) to enhance distant objects, and Ground Abandonment Sampling (GAS) to focus on non-ground points. The sampled multi-view points are processed by a Consistent KeyPoint Selection (CKPS) module to generate consistent keypoint masks for efficient proposal sampling. The first-stage detector performs multi-branch parallel learning with a multi-view consistency loss for feature aggregation, while the second-stage detector fuses multi-view data through a Multi-View Fusion Pooling (MVFP) module to precisely predict 3D objects. Experimental results on the KITTI dataset and the Waymo Open Dataset show that our method delivers substantial detection performance improvements across a variety of backbones, especially low-performance backbones with simple network structures.
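The three SMS branches described above can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the distance-proportional weighting used for DES, and the fixed ground-height threshold used for GAS are all assumptions introduced here for clarity.

```python
import numpy as np

def random_sampling(points, n):
    # Branch 1: uniform random subsampling of the point cloud (N x 4: x, y, z, intensity).
    idx = np.random.choice(len(points), size=min(n, len(points)), replace=False)
    return points[idx]

def density_equalization_sampling(points, n):
    # Branch 2 (hypothetical DES sketch): sample with probability proportional
    # to range, so sparse distant regions retain relatively more points.
    r = np.linalg.norm(points[:, :3], axis=1)
    p = r / r.sum()
    idx = np.random.choice(len(points), size=min(n, len(points)), replace=False, p=p)
    return points[idx]

def ground_abandonment_sampling(points, n, ground_z=-1.6):
    # Branch 3 (hypothetical GAS sketch): discard points at or below an assumed
    # ground height, then subsample the remaining non-ground points.
    non_ground = points[points[:, 2] > ground_z]
    return random_sampling(non_ground, n)

def sms_module(points, n):
    # Run the three sampling branches in parallel, yielding one "view" per branch.
    return {
        "random": random_sampling(points, n),
        "des": density_equalization_sampling(points, n),
        "gas": ground_abandonment_sampling(points, n, ground_z=-1.6),
    }
```

Each returned view would then feed the CKPS module, which selects keypoints that are consistent across the three sampled views.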