Monocular depth estimation (MDE) is a fundamental topic in geometric computer vision and a core technique for many downstream applications. Recently, several methods have reframed MDE as a classification-regression problem in which depth is predicted as a linear combination of bin centers weighted by a predicted probability distribution. In this paper, we propose a novel concept of iterative elastic bins (IEBins) for classification-regression-based MDE. IEBins searches for high-quality depth by progressively narrowing the search range over multiple stages, where each stage performs a finer-grained depth search within the target bin identified by the previous stage. To alleviate possible error accumulation during the iterative process, we replace the original target bin with a novel elastic target bin whose width is adjusted elastically based on the depth uncertainty. Furthermore, we develop a dedicated framework composed of a feature extractor and an iterative optimizer, the latter having powerful temporal context modeling capabilities thanks to its GRU-based architecture. Extensive experiments on the KITTI, NYU-Depth-v2 and SUN RGB-D datasets demonstrate that the proposed method surpasses prior state-of-the-art competitors. The source code is publicly available at https://github.com/ShuweiShao/IEBins.
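The classification-regression formulation and the coarse-to-fine search described above can be illustrated for a single pixel. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: `toy_probs` is a hypothetical stand-in for the distribution the GRU-based optimizer would predict, and both the use of the distribution's standard deviation as the depth uncertainty and the choice of the next search range as the current depth plus or minus that uncertainty are assumptions made for illustration.

```python
import numpy as np


def bin_centers(lo, hi, n_bins):
    """Centers of n_bins uniform bins covering the range [lo, hi]."""
    edges = np.linspace(lo, hi, n_bins + 1)
    return 0.5 * (edges[:-1] + edges[1:])


def expected_depth(probs, centers):
    """Classification-regression depth: probability-weighted sum of bin centers."""
    return float(np.dot(probs, centers))


def depth_uncertainty(probs, centers, depth):
    """Assumed uncertainty measure: standard deviation of the bin distribution."""
    return float(np.sqrt(np.dot(probs, (centers - depth) ** 2)))


def toy_probs(centers, true_depth, temperature):
    """Hypothetical stand-in for the network output: a softmax over bins
    that favors bins whose centers lie close to the true depth."""
    logits = -((centers - true_depth) ** 2) / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()


def iterative_elastic_search(true_depth, d_min=0.1, d_max=10.0,
                             n_bins=16, n_stages=4):
    """Run several stages, each re-binning a narrower range around the
    current estimate. Returns one (lo, hi, depth, sigma) tuple per stage."""
    lo, hi = d_min, d_max
    history = []
    for _ in range(n_stages):
        centers = bin_centers(lo, hi, n_bins)
        width = (hi - lo) / n_bins
        probs = toy_probs(centers, true_depth, temperature=width ** 2)
        depth = expected_depth(probs, centers)
        sigma = depth_uncertainty(probs, centers, depth)
        history.append((lo, hi, depth, sigma))
        # Elastic target bin (assumed form): centered on the current depth,
        # at least one bin wide, and wider when the uncertainty is larger.
        half = max(sigma, 0.5 * width)
        lo = max(d_min, depth - half)
        hi = min(d_max, depth + half)
    return history


if __name__ == "__main__":
    for lo, hi, depth, sigma in iterative_elastic_search(true_depth=3.7):
        print(f"range=[{lo:.4f}, {hi:.4f}]  depth={depth:.4f}  sigma={sigma:.4f}")
```

In this sketch, widening the next range when the distribution is spread out is what keeps a poor early estimate from permanently excluding the correct depth, which is the intuition behind making the target bin elastic rather than fixed.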