The Manhattan world assumption, under which scene structures such as cuboid buildings align with three mutually orthogonal directions, is useful for camera angle estimation. However, accurate and robust angle estimation from fisheye images in a Manhattan world remains an open challenge, because general scene images often lack geometric constraints such as lines, arcs, and vanishing points. To achieve higher accuracy and robustness, we propose a learning-based calibration method that uses heatmap regression, similar to keypoint-based pose estimation, to detect the directions of labeled image coordinates. Simultaneously, our two estimators recover the rotation and remove fisheye distortion by remapping from a general scene image. We find that, when vanishing-point constraints are set aside, additional points can be defined for learning-based methods. To compensate for the lack of vanishing points in images, we introduce auxiliary diagonal points whose 3D arrangement is optimal in terms of spatial uniformity. Extensive experiments demonstrate that our method outperforms conventional methods on large-scale datasets and with off-the-shelf cameras.
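To make the idea of recovering rotation from detected directions concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the labeled points correspond to the six Manhattan axis directions and that the auxiliary diagonal points are the eight cube-corner directions; both are our reading of the abstract. It also swaps in a standard least-squares fit (the Kabsch / orthogonal Procrustes solution) to estimate the rotation from whichever directions were detected. The function names `manhattan_directions` and `fit_rotation` are hypothetical.

```python
import itertools

import numpy as np


def manhattan_directions():
    """Return 14 unit vectors: 6 axis directions and 8 cube-corner diagonals.

    Assumption: the diagonals (+-1, +-1, +-1) / sqrt(3) stand in for the
    'auxiliary diagonal points' mentioned in the abstract.
    """
    axes = np.vstack([np.eye(3), -np.eye(3)])  # +X, +Y, +Z, -X, -Y, -Z
    corners = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
    diagonals = corners / np.sqrt(3.0)
    return np.vstack([axes, diagonals])  # shape (14, 3)


def fit_rotation(world_dirs, camera_dirs):
    """Least-squares rotation R such that camera_dirs[i] ~= R @ world_dirs[i].

    Both inputs are (N, 3) arrays of unit vectors; only the directions that
    were actually detected in the image need to be passed in.
    """
    # Cross-covariance matrix: sum over i of camera_i * world_i^T.
    h = camera_dirs.T @ world_dirs
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt
```

In this sketch, `camera_dirs` would come from back-projecting the heatmap peaks through an estimated fisheye camera model; that step is not shown because the abstract does not specify it. Adding the diagonal directions simply gives the fit more correspondences when few axis directions are visible.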