Depth estimation from monocular images is pivotal for real-world visual perception systems. Current learning-based depth estimation models are trained and tested on meticulously curated data, and they often overlook out-of-distribution (OoD) situations. Yet in practical settings, especially safety-critical ones such as autonomous driving, common corruptions can arise. To address this oversight, we introduce RoboDepth, a comprehensive robustness test suite encompassing 18 corruptions in three categories: i) weather and lighting conditions; ii) sensor failures and movement; and iii) data processing anomalies. We then benchmark 42 depth estimation models across indoor and outdoor scenes to assess their resilience to these corruptions. Our findings show that, in the absence of a dedicated robustness evaluation framework, many leading depth estimation models may be susceptible to typical corruptions. We discuss design considerations for building more robust depth estimation models, covering pre-training, augmentation, modality, model capacity, and learning paradigms. We anticipate that our benchmark will establish a foundational platform for advancing robust OoD depth estimation.
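As a rough illustration of the evaluation described above, the following minimal sketch applies corruptions at several severity levels and compares a model's depth error on clean versus corrupted inputs. It is not the RoboDepth implementation. The two corruption functions, the five severity levels, the noise and darkening scales, the `toy_model` stand-in, and the choice of absolute relative error as the metric are all assumptions made for illustration; the abstract does not specify them.

```python
import random
from typing import Callable, Dict, List, Sequence

Image = List[float]  # flattened grayscale pixels in [0, 1] (illustrative format)
Depth = List[float]  # flattened per-pixel depth, strictly positive


def abs_rel(pred: Depth, gt: Depth) -> float:
    """Absolute relative error: mean of |pred - gt| / gt over all pixels."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def gaussian_noise(img: Image, severity: int, seed: int = 0) -> Image:
    """Hypothetical sensor-noise corruption; sigma grows with severity."""
    rng = random.Random(seed)
    sigma = 0.04 * severity  # assumed scale, not taken from the paper
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in img]


def darken(img: Image, severity: int) -> Image:
    """Hypothetical low-light corruption; pixels are scaled toward black."""
    scale = 1.0 - 0.15 * severity  # assumed scale, not taken from the paper
    return [v * scale for v in img]


def toy_model(img: Image) -> Depth:
    """Stand-in for a depth network: brighter pixels are predicted as closer."""
    return [1.0 + 9.0 * (1.0 - v) for v in img]


def evaluate(
    model: Callable[[Image], Depth],
    images: Sequence[Image],
    depths: Sequence[Depth],
    corruptions: Dict[str, Callable[[Image, int], Image]],
    severities: Sequence[int] = (1, 2, 3, 4, 5),
) -> Dict[str, float]:
    """Return the mean abs_rel on clean inputs and per corruption type."""
    clean = [abs_rel(model(img), gt) for img, gt in zip(images, depths)]
    results = {"clean": sum(clean) / len(clean)}
    for name, corrupt in corruptions.items():
        errors = [
            abs_rel(model(corrupt(img, s)), gt)
            for s in severities
            for img, gt in zip(images, depths)
        ]
        results[name] = sum(errors) / len(errors)
    return results


# Tiny synthetic data set: ground truth is defined so the toy model is exact
# on clean inputs, which makes any degradation attributable to the corruption.
_rng = random.Random(42)
images = [[_rng.uniform(0.2, 0.9) for _ in range(64)] for _ in range(4)]
depths = [toy_model(img) for img in images]

results = evaluate(
    toy_model,
    images,
    depths,
    {"gaussian_noise": gaussian_noise, "darken": darken},
)
```

Here `results` maps `"clean"` and each corruption name to a mean error, so the gap between the clean score and a corrupted score gives a simple per-corruption measure of robustness. A real benchmark would replace `toy_model` with a trained network and the synthetic data with actual images and ground-truth depth.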