Autonomous vehicles are equipped with a multi-modal sensor setup that enables the car to drive safely. The initial calibration of these perception sensors is a mature topic and is routinely performed in an automated factory environment. However, it remains an open question how to maintain calibration quality throughout the vehicle's operating life. A further challenge is to calibrate multiple sensors jointly so that systematic errors do not propagate between them. In this paper, we propose CaLiCa, an end-to-end deep self-calibration network that addresses automatic calibration of a pinhole camera and a Lidar. We jointly predict the camera intrinsic parameters (focal length and distortion) and the Lidar-camera extrinsic parameters (rotation and translation) by regressing the feature correlation between the camera image and the Lidar point cloud. The network is arranged in a Siamese-twin structure, which constrains feature learning to a representation shared by the point cloud and the camera image (the Lidar-camera constraint). Evaluation on the KITTI dataset shows that we achieve a rotation accuracy of 0.154° and a translation accuracy of 0.059 m, with a reprojection error of 0.028 pixels, in a single inference pass. We also provide an ablation study showing that our end-to-end learning architecture reaches a lower terminal loss (a 21% decrease in rotation loss) than isolated calibration.
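To make the quantities in the abstract concrete, the sketch below shows how intrinsic parameters (focal length, distortion) and extrinsic parameters (rotation, translation) together map a Lidar point to a pixel, and how a reprojection error can be measured between two calibrations. It is not the CaLiCa network or its loss, only an illustration of the standard pinhole model under simplifying assumptions: a single radial distortion coefficient `k1`, a rotation restricted to the optical axis, and points already expressed in a z-forward convention. All function and parameter names are hypothetical.

```python
import math


def rotate_about_optical_axis(point, angle_rad):
    """Rotate a 3D point about the z-axis (a stand-in for a full 3D rotation)."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x - s * y, s * x + c * y, z)


def project(point_lidar, angle_rad, translation, focal, center, k1):
    """Map a Lidar point to pixel coordinates with a pinhole model.

    Extrinsics (assumed form): angle_rad and translation, Lidar to camera frame.
    Intrinsics (assumed form): focal length, principal point, radial term k1.
    """
    xr, yr, zr = rotate_about_optical_axis(point_lidar, angle_rad)
    xc, yc, zc = xr + translation[0], yr + translation[1], zr + translation[2]
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Normalized image coordinates, then radial distortion.
    xn, yn = xc / zc, yc / zc
    distortion = 1.0 + k1 * (xn * xn + yn * yn)
    u = focal * distortion * xn + center[0]
    v = focal * distortion * yn + center[1]
    return (u, v)


def mean_reprojection_error(points, reference, estimate):
    """Average pixel distance between projections under two calibrations."""
    errors = []
    for p in points:
        u0, v0 = project(p, **reference)
        u1, v1 = project(p, **estimate)
        errors.append(math.hypot(u1 - u0, v1 - v0))
    return sum(errors) / len(errors)


# Illustrative values only; they are not taken from the paper or from KITTI.
reference = dict(angle_rad=0.0, translation=(0.0, 0.0, 0.0),
                 focal=700.0, center=(600.0, 180.0), k1=0.0)
estimate = dict(reference, angle_rad=math.radians(0.154))
points = [(1.0, 0.5, 10.0), (-2.0, 1.0, 20.0)]
print(mean_reprojection_error(points, reference, estimate))
```

Because the projection depends on both parameter sets at once, an error in the estimated focal length can be partly masked by an error in the estimated translation, which is one way to read the abstract's motivation for predicting intrinsics and extrinsics jointly.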