Vector maps are essential in autonomous driving for tasks such as localization and planning, yet they are costly to create and maintain. Recent advances in online vector map generation for autonomous vehicles are promising, but current models adapt poorly to different sensor configurations. They tend to overfit to specific sensor poses, which degrades performance on new configurations and raises retraining costs. This limitation hampers their use in real-world applications. To address this challenge, we propose a modular pipeline for vector map generation with improved generalization across sensor configurations. The pipeline uses probabilistic semantic mapping to generate a bird's-eye-view (BEV) semantic map as an intermediate representation, which is then converted to a vector map using the MapTRv2 decoder. Because the BEV semantic map is robust to different sensor configurations, the proposed approach significantly improves generalization performance. We evaluate the model on datasets with sensor configurations not used during training. Our evaluation sets include larger public datasets and smaller-scale private data collected on our platform. Our model generalizes significantly better than state-of-the-art methods.
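To make the intermediate representation more concrete, the following is a minimal sketch of one common form of probabilistic semantic mapping: a per-cell categorical Bayes update on a BEV grid. It is not the pipeline's actual implementation, which the abstract does not specify. The class `SemanticBEVGrid`, its parameters (grid size, resolution, number of classes), and the assumption that observations arrive as ground-plane points with per-class likelihoods are all hypothetical and chosen for illustration only.

```python
import math


class SemanticBEVGrid:
    """Hypothetical BEV grid holding a categorical belief per cell (log space)."""

    def __init__(self, size_m=20.0, resolution_m=0.5, num_classes=3):
        self.resolution = resolution_m
        self.half = size_m / 2.0          # grid is centered on the vehicle
        self.n = int(round(size_m / resolution_m))
        self.k = num_classes
        prior = math.log(1.0 / num_classes)  # uniform prior over classes
        self.log_belief = [
            [[prior] * num_classes for _ in range(self.n)] for _ in range(self.n)
        ]

    def _cell(self, x, y):
        """Map a ground-plane point (meters) to (row, col), or None if outside."""
        col = int(math.floor((x + self.half) / self.resolution))
        row = int(math.floor((y + self.half) / self.resolution))
        if 0 <= row < self.n and 0 <= col < self.n:
            return row, col
        return None

    def update(self, x, y, likelihood):
        """Fuse one observation: posterior is proportional to prior * likelihood."""
        cell = self._cell(x, y)
        if cell is None:
            return False
        row, col = cell
        eps = 1e-6  # avoid log(0) for classes the observation rules out
        unnorm = [
            b + math.log(max(p, eps))
            for b, p in zip(self.log_belief[row][col], likelihood)
        ]
        # Normalize with log-sum-exp for numerical stability.
        m = max(unnorm)
        log_z = m + math.log(sum(math.exp(u - m) for u in unnorm))
        self.log_belief[row][col] = [u - log_z for u in unnorm]
        return True

    def probabilities(self, x, y):
        """Return the class probabilities of the cell containing (x, y)."""
        cell = self._cell(x, y)
        if cell is None:
            return None
        row, col = cell
        return [math.exp(b) for b in self.log_belief[row][col]]

    def argmax_map(self):
        """Return the most likely class index for every cell."""
        return [
            [max(range(self.k), key=lambda c: cell[c]) for cell in row]
            for row in self.log_belief
        ]


# Usage: three observations of the same point, each favoring class 1.
grid = SemanticBEVGrid()
for _ in range(3):
    grid.update(1.2, -0.3, [0.2, 0.7, 0.1])
probs = grid.probabilities(1.2, -0.3)
```

Because the grid is defined in a vehicle-centered ground-plane frame rather than in any camera's image frame, the fused belief does not depend directly on where a sensor is mounted. This illustrates, under the assumptions above, why such a representation can be less sensitive to sensor pose than features learned in a sensor's own frame.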