Monocular depth estimation is a critical task for autonomous driving and many other computer vision applications. While significant progress has been made in this field, the effects of viewpoint shifts on depth estimation models remain largely underexplored. This paper introduces a novel dataset and evaluation methodology to quantify the impact of different camera positions and orientations on monocular depth estimation performance. We propose a ground-truth generation strategy based on homography estimation and object detection, eliminating the need for expensive LiDAR sensors. We collect a diverse dataset of road scenes from multiple viewpoints and use it to assess the robustness of a modern depth estimation model to geometric shifts. After validating our strategy on a public dataset, we provide valuable insights into the limitations of current models and highlight the importance of considering viewpoint variations in real-world applications.