Online, Target-Free LiDAR-Camera Extrinsic Calibration via Cross-Modal Mask Matching

LiDAR-camera extrinsic calibration (LCEC) is crucial for data fusion in intelligent vehicles. Offline, target-based approaches have long been the preferred choice in this field, but they often adapt poorly to real-world environments, largely because extrinsic parameters may change significantly after moderate shocks or during extended operation in environments with vibrations. In contrast, online, target-free approaches provide greater adaptability yet typically lack robustness, primarily due to the challenges of cross-modal feature matching. Therefore, in this article, we harness large vision models (LVMs), which are emerging as a significant trend in computer vision and robotics, especially for embodied artificial intelligence, to achieve robust and accurate online, target-free LCEC across a variety of challenging scenarios. Our main contributions are threefold: we introduce a novel framework known as MIAS-LCEC, provide an open-source, versatile calibration toolbox with an interactive visualization interface, and publish three real-world datasets captured in various indoor and outdoor environments. The cornerstone of our framework and toolbox is the cross-modal mask matching (C3M) algorithm, which is built on a state-of-the-art (SoTA) LVM and is capable of generating sufficient and reliable matches. Extensive experiments conducted on these real-world datasets demonstrate the robustness of our approach and its superior performance compared to SoTA methods, particularly for solid-state LiDARs with super-wide fields of view.
