Automotive perception systems must meet stringent requirements. While optical sensors such as Camera and Lidar struggle in adverse weather conditions, Radar provides more robust perception, effectively penetrating fog, rain, and snow. Because full Radar tensors are large and very few datasets provide them, most Radar-based approaches operate on sparse point clouds or 2D projections, which can incur information loss. At the same time, deep learning methods show potential to extract richer and denser features from low-level Radar data and thereby significantly improve perception performance. We therefore propose a 3D projection method for fast-Fourier-transformed 4D Range-Azimuth-Doppler-Elevation (RADE) tensors. Our method preserves rich Doppler and Elevation features while reducing the data size of a single frame by 91.9% compared to the full tensor, yielding faster training and inference as well as lower model complexity. We introduce RADE-Net, a lightweight model tailored to 3D projections of the RADE tensor. Its backbone exploits low-level and high-level cues of Radar tensors through spatial and channel attention. Decoupled detection heads predict object center points directly in the Range-Azimuth domain and regress rotated 3D bounding boxes from rich feature maps in the Cartesian scene. We evaluate the model on scenes with multiple different road users and under various weather conditions on the large-scale K-Radar dataset and achieve a 16.7% improvement over its baseline, as well as a 6.5% improvement over current Radar-only models. Additionally, we outperform several Lidar approaches in scenarios with adverse weather conditions. The code is available at https://github.com/chr-is-tof/RADE-Net.
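The core idea of projecting a 4D RADE tensor down to a small set of 3D views can be sketched as follows. This is a minimal illustration, not the paper's method: the tensor shape, the choice of max-pooling as the projection operator, and the particular pair of projections (Range-Azimuth-Doppler and Range-Azimuth-Elevation) are all assumptions; the abstract does not specify the exact operator, projection set, or K-Radar cube dimensions, and the 91.9% figure depends on those details.

```python
import numpy as np

def project_rade(rade):
    """Collapse one axis of a (Range, Azimuth, Doppler, Elevation)
    tensor at a time to obtain two 3D projections (an assumed scheme
    for illustration): Range-Azimuth-Doppler (Elevation collapsed)
    and Range-Azimuth-Elevation (Doppler collapsed)."""
    rad = rade.max(axis=3)  # (R, A, D): preserves Doppler features
    rae = rade.max(axis=2)  # (R, A, E): preserves Elevation features
    return rad, rae

# Illustrative dimensions only, not the actual K-Radar cube sizes.
rade = np.random.rand(256, 107, 64, 37).astype(np.float32)
rad, rae = project_rade(rade)

# Data-size reduction relative to the full 4D tensor; the exact
# percentage varies with cube dimensions and the projection set.
full_size = rade.size
proj_size = rad.size + rae.size
reduction = 1.0 - proj_size / full_size
print(rad.shape, rae.shape, f"{reduction:.1%}")
```

The point of the sketch is that each 3D projection retains two spatial axes plus one of the feature axes, so Doppler and Elevation information survive the reduction while the combined footprint is an order of magnitude smaller than the full 4D cube.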