Autonomous driving faces safety challenges due to the lack of a global perspective and of the semantic information provided by vectorized high-definition (HD) maps. Information from roadside cameras can greatly expand the map perception range through vehicle-to-infrastructure (V2I) communications. However, no real-world dataset is yet available for studying onboard map vectorization under vehicle-infrastructure cooperation. To advance research on online HD mapping for Vehicle-Infrastructure Cooperative Autonomous Driving (VICAD), we release a real-world dataset that contains collaborative camera frames from both vehicles and roadside infrastructure and provides human annotations of HD map elements. We also present an end-to-end neural framework (V2I-HD) that leverages vision-centric V2I systems to construct vectorized maps. To reduce computation costs and enable deployment of V2I-HD on autonomous vehicles, we introduce a directionally decoupled self-attention mechanism. Extensive experiments on our real-world dataset show that V2I-HD achieves superior real-time inference speed. Qualitative results further demonstrate stable and robust map construction at low cost across complex and varied driving scenes. As a benchmark, both the source code and the dataset have been released on OneDrive for further study.
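The abstract names a directionally decoupled self-attention mechanism but does not define it. A common cost-reduction technique matching that description is axial (axis-decoupled) attention, which attends along the rows and columns of a 2D feature map separately, cutting the score-matrix cost from O((HW)²) to O(HW·(H+W)). The following is a minimal numpy sketch under that assumption; the function names, the use of identity Q/K/V projections, and the single-head form are illustrative simplifications, not the authors' actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axis_attention(x, axis):
    """Single-head self-attention restricted to one spatial axis.

    x: feature map of shape (H, W, C); axis 0 attends along columns,
    axis 1 along rows. Q, K, V use identity projections for brevity.
    """
    x_moved = np.moveaxis(x, axis, 0)          # (L, other, C)
    L, C = x_moved.shape[0], x_moved.shape[-1]
    q = k = v = x_moved.reshape(L, -1, C)      # (L, N, C)
    # Per-position attention only within the chosen axis: (N, L, L).
    scores = np.einsum('lnc,mnc->nlm', q, k) / np.sqrt(C)
    attn = softmax(scores, axis=-1)
    out = np.einsum('nlm,mnc->lnc', attn, v)   # (L, N, C)
    return np.moveaxis(out.reshape(x_moved.shape), 0, axis)

def decoupled_self_attention(x):
    # Attend along the W axis, then the H axis, instead of full 2D attention.
    return axis_attention(axis_attention(x, 1), 0)

feat = np.random.rand(8, 16, 32)   # hypothetical BEV feature map: H=8, W=16, C=32
out = decoupled_self_attention(feat)
print(out.shape)                   # (8, 16, 32)
```

The sequential row-then-column pass preserves the output shape while each position still aggregates information from its full row and column, which is the usual rationale for decoupling attention directions on dense map features.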