Sensor fusion is crucial for accurate and robust perception in autonomous vehicles. Most existing datasets and perception solutions focus on fusing cameras and LiDAR, while the collaboration between camera and radar remains significantly under-exploited. Combining rich semantic information from the camera with reliable 3D information from the radar can potentially yield an efficient, cheap, and portable solution for 3D object perception tasks. Such a solution can also be robust across different lighting and weather conditions, owing to the capabilities of mmWave radar. In this paper, we introduce the CRUW3D dataset, which includes 66K synchronized and well-calibrated camera, radar, and LiDAR frames in various driving scenarios. Unlike other large-scale autonomous driving datasets, our radar data is in the format of radio frequency (RF) tensors that contain not only 3D location information but also spatio-temporal semantic information. This radar format can enable machine learning models to generate more reliable object perception results by exchanging and fusing information or features between the camera and radar.