In this work, we propose StereoGeo, an end-to-end, network-based approach for stereo camera calibration. Our method estimates the focal lengths and gravity directions of the left and right cameras, as well as the relative extrinsic transformation between them. Existing methods often rely on calibration patterns in structured environments or address only a single-camera configuration; they are limited to either intrinsic or extrinsic estimation and depend on multi-view setups. StereoGeo extends the GeoCalib algorithm, integrating deep neural network feature extraction with a differentiable optimizer. Extensive experiments on real-world benchmarks demonstrate that StereoGeo achieves competitive performance for intrinsic calibration and provides accurate stereo extrinsic estimation, outperforming existing methods that are limited to monocular settings. Part of the dataset used in this work is publicly available at https://github.com/meddourimane/StereoGeo-dataset.