Semantic scene segmentation from a bird's-eye-view (BEV) perspective plays a crucial role in facilitating planning and decision-making for mobile robots. Although recent vision-only methods have demonstrated notable advancements in performance, they often struggle under adverse environmental conditions such as rain or nighttime. While active sensors offer a solution to this challenge, the prohibitively high cost of LiDARs remains a limiting factor. Fusing camera data with automotive radars offers a less expensive alternative but has received less attention in prior research. In this work, we aim to advance this promising avenue by introducing BEVCar, a novel approach for joint BEV object and map segmentation. The core novelty of our approach lies in first learning a point-based encoding of raw radar data, which is then leveraged to efficiently initialize the lifting of image features into the BEV space. We perform extensive experiments on the nuScenes dataset and demonstrate that BEVCar outperforms the current state of the art. Moreover, we show that incorporating radar information significantly enhances robustness in challenging environmental conditions and improves segmentation performance for distant objects. To foster future research, we provide the weather split of the nuScenes dataset used in our experiments, along with our code and trained models at http://bevcar.cs.uni-freiburg.de.
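To make the core idea more concrete, the following is a minimal PyTorch sketch of a point-based radar encoding that is rasterized into a BEV grid, which could then initialize the lifting of image features. The module names, feature dimensions, grid resolution, and the max-pooling rasterization are illustrative assumptions rather than the paper's reference implementation.

```python
# Minimal sketch (illustrative assumptions, not the authors' implementation):
# encode raw radar points with a shared MLP and scatter them onto a BEV grid.
import torch
import torch.nn as nn


class RadarPointEncoder(nn.Module):
    """Shared MLP over raw radar returns (e.g., x, y, z, RCS, radial velocity)."""

    def __init__(self, in_dim: int = 6, feat_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (N, in_dim) raw radar points -> (N, feat_dim) per-point features
        return self.mlp(points)


def scatter_to_bev(point_feats, points_xy, grid=200, cell_m=0.5, feat_dim=64):
    """Rasterizes per-point features onto a BEV grid via per-cell max-pooling."""
    bev = torch.zeros(feat_dim, grid, grid)
    # metric (x, y) coordinates -> grid indices, with the ego vehicle at the center
    idx = (points_xy / cell_m + grid / 2).long().clamp(0, grid - 1)
    for feat, (ix, iy) in zip(point_feats, idx):
        bev[:, iy, ix] = torch.maximum(bev[:, iy, ix], feat)
    # this radar BEV tensor could initialize the BEV queries that attend to
    # multi-view image features during the lifting step
    return bev


# toy usage: 100 radar points with 6 channels each
points = torch.randn(100, 6)
bev_init = scatter_to_bev(RadarPointEncoder()(points), points[:, :2])
print(bev_init.shape)  # torch.Size([64, 200, 200])
```

The intuition, following the abstract, is that the lifting of image features into BEV space starts from geometry the radar already provides instead of from an uninformed initialization.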