InSpaceType: Reconsider Space Type in Indoor Monocular Depth Estimation

Indoor monocular depth estimation has attracted increasing research interest. Most previous works have focused on methodology, experimenting primarily with the NYU-Depth-V2 (NYUv2) dataset and reporting only overall performance on the test set. However, little is known about robustness and generalization when monocular depth estimation methods are applied to real-world scenarios, where highly varied functional \textit{space types} such as libraries or kitchens are present. Breaking performance down by space type is essential for understanding the performance variance of a pretrained model. To facilitate our investigation of robustness and to address the limitations of previous works, we collect InSpaceType, a high-quality, high-resolution RGBD dataset for general indoor environments. We benchmark 12 recent methods on InSpaceType and find that they suffer from severe performance imbalance across space types, which reveals their underlying bias. We extend our analysis to 4 other datasets, 3 mitigation approaches, and the ability to generalize to unseen space types. Our work is the first in-depth investigation of performance imbalance across space types for indoor monocular depth estimation. It draws attention to the potential safety concerns of deploying models without considering space types, and sheds light on potential ways to improve robustness. See \url{https://depthcomputation.github.io/DepthPublic} for the data and the supplementary document. The benchmark list on the GitHub project page is kept up to date with the latest monocular depth estimation methods.

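The per-space-type performance breakdown described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes two metrics that are standard in monocular depth estimation, absolute relative error (AbsRel) and the $\delta < 1.25$ accuracy, and it assumes each sample carries a space-type label. The function names, the `samples` layout, and the toy depth values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def valid_pairs(pred, gt):
    """Keep only pixels with positive ground truth and prediction (assumed validity mask)."""
    return [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]


def abs_rel(pred, gt):
    """Absolute relative error: mean of |pred - gt| / gt over valid pixels."""
    return mean(abs(p - g) / g for p, g in valid_pairs(pred, gt))


def delta1(pred, gt, threshold=1.25):
    """Fraction of valid pixels whose ratio max(pred/gt, gt/pred) is below the threshold."""
    return mean(
        1.0 if max(p / g, g / p) < threshold else 0.0
        for p, g in valid_pairs(pred, gt)
    )


def breakdown_by_space_type(samples):
    """Average per-image metrics within each space type.

    samples: iterable of (space_type, predicted_depths, ground_truth_depths),
    where the depth arguments are flat lists of per-pixel values (hypothetical layout).
    """
    per_type = defaultdict(list)
    for space_type, pred, gt in samples:
        per_type[space_type].append((abs_rel(pred, gt), delta1(pred, gt)))
    return {
        space_type: {
            "abs_rel": mean(s[0] for s in scores),
            "delta1": mean(s[1] for s in scores),
            "n": len(scores),
        }
        for space_type, scores in per_type.items()
    }


# Toy data, invented purely for illustration.
samples = [
    ("kitchen", [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    ("kitchen", [1.1, 2.2, 3.3], [1.0, 2.0, 3.0]),
    ("library", [2.0, 4.0, 9.0], [1.0, 2.0, 3.0]),
]
report = breakdown_by_space_type(samples)
for space_type, metrics in sorted(report.items()):
    print(space_type, metrics)
```

A single overall average over these three toy images would hide the fact that the error is concentrated in one space type, which is the kind of imbalance the per-type report makes visible.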