Scene understanding is essential to autonomous driving and to maintaining high standards of performance and safety. Cameras and laser scanners (LiDARs) are the sensors most commonly used for this task, while radars are less popular. Even so, radars are low-cost, information-dense, fast-sensing, and robust to adverse weather conditions. Although several methods for radar-based semantic scene segmentation have been proposed, radar data remains challenging because of its inherent noise and sparsity and the imbalance between foreground and background. In this work, we propose an approach to semantic segmentation of radar scenes that fuses multiple radar inputs through a novel architecture and loss functions tailored to the drawbacks of radar perception. The architecture includes an efficient attention block that adaptively captures important feature information. Our method, TransRadar, outperforms state-of-the-art methods on the CARRADA and RADIal datasets while using a smaller model. Code: https://github.com/YahiDar/TransRadar