SMURF: Spatial Multi-Representation Fusion for 3D Object Detection with 4D Imaging Radar

4D millimeter-wave (mmWave) radar is a promising technology for vehicle sensing because it is cost-effective and remains operable in adverse weather conditions. However, its adoption has been hindered by the sparsity and noise of radar point cloud data. This paper introduces spatial multi-representation fusion (SMURF), a novel approach to 3D object detection using a single 4D imaging radar. SMURF leverages multiple representations of radar detection points, including pillarization and density features of a multi-dimensional Gaussian mixture distribution obtained through kernel density estimation (KDE). KDE effectively mitigates the measurement inaccuracy caused by the limited angular resolution and multi-path propagation of radar signals. Additionally, KDE helps alleviate point cloud sparsity by capturing density features. Experimental evaluations on the View-of-Delft (VoD) and TJ4DRadSet datasets demonstrate the effectiveness and generalization ability of SMURF, which outperforms recently proposed 4D imaging radar-based single-representation models. Moreover, while using 4D imaging radar only, SMURF achieves performance comparable to that of the state-of-the-art 4D imaging radar and camera fusion-based method, with an increase of 1.22% in mean average precision on the bird's-eye view of the TJ4DRadSet dataset and 1.32% in 3D mean average precision on the entire annotated area of the VoD dataset. The proposed method is also fast enough to address the challenge of real-time detection, with an inference time of no more than 0.05 seconds for most scans on both datasets. This research highlights the benefits of 4D mmWave radar and provides a strong benchmark for subsequent work on 3D object detection with 4D imaging radar.
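To make the two point representations named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes an isotropic Gaussian kernel with a single hand-picked bandwidth and a uniform x-y grid; the function names `kde_density` and `pillar_indices`, the parameters `bandwidth` and `cell_size`, and the toy points are all hypothetical. The abstract does not specify the kernel, the bandwidths, or the grid settings used in SMURF.

```python
import numpy as np


def kde_density(points, bandwidth=1.0):
    """Gaussian KDE evaluated at every point of the cloud.

    Assumption: isotropic kernel with one shared bandwidth. Each point's
    own kernel is included in its estimate.
    """
    pts = np.asarray(points, dtype=float)
    n_dims = pts.shape[1]
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = pts[:, None, :] - pts[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Normalization constant of a d-dimensional isotropic Gaussian.
    norm = (2.0 * np.pi * bandwidth ** 2) ** (n_dims / 2.0)
    kernel = np.exp(-0.5 * sq_dist / bandwidth ** 2) / norm
    # Average kernel response over all points gives the density estimate.
    return kernel.mean(axis=1)


def pillar_indices(points, cell_size=0.5):
    """Assign each point to a vertical pillar on a uniform x-y grid."""
    pts = np.asarray(points, dtype=float)
    return np.floor(pts[:, :2] / cell_size).astype(int)


# Toy cloud (hypothetical values): three nearby detections and one outlier.
demo_points = np.array([
    [0.0, 0.0, 0.0],
    [0.2, 0.1, 0.0],
    [0.1, 0.3, 0.1],
    [10.0, 10.0, 0.5],
])
density = kde_density(demo_points, bandwidth=0.5)
pillars = pillar_indices(demo_points, cell_size=0.5)
```

In this toy cloud, the three clustered detections receive a higher density value than the isolated one, which illustrates how a per-point density feature can help separate dense object returns from isolated, possibly spurious detections. The pairwise distance matrix costs O(N²) memory, which is workable for sparse radar scans but would need a neighbor search for denser clouds.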
