Semantic scene segmentation from a bird's-eye-view (BEV) perspective plays a crucial role in planning and decision-making for mobile robots. Although recent vision-only methods have made notable progress, they often struggle under adverse weather and illumination conditions such as rain or nighttime. Active sensors offer a solution to this challenge, but the prohibitively high cost of LiDARs remains a limiting factor. Fusing camera data with automotive radars is a less expensive alternative that has received less attention in prior research. In this work, we aim to advance this promising avenue by introducing BEVCar, a novel approach for joint BEV object and map segmentation. The core novelty of our approach lies in first learning a point-based encoding of raw radar data, which is then used to efficiently initialize the lifting of image features into the BEV space. We perform extensive experiments on the nuScenes dataset and demonstrate that BEVCar outperforms the current state of the art. Moreover, we show that incorporating radar information significantly enhances robustness in challenging environmental conditions and improves segmentation performance for distant objects. To foster future research, we provide the weather split of the nuScenes dataset used in our experiments, along with our code and trained models, at http://bevcar.cs.uni-freiburg.de.