We introduce VA-DepthNet, a simple, effective, and accurate deep neural network approach for the single-image depth prediction (SIDP) problem. The proposed approach advocates using classical first-order variational constraints for this problem. While state-of-the-art deep neural network methods for SIDP learn the scene depth from images in a supervised setting, they often overlook the invaluable invariances and priors in the rigid scene space, such as the regularity of the scene. The paper's main contribution is to reveal the benefit of classical and well-founded variational constraints in neural network design for the SIDP task. We show that imposing first-order variational constraints in the scene space, together with a popular encoder-decoder network architecture, provides excellent results for the supervised SIDP task. The imposed first-order variational constraint makes the network aware of the depth gradient in the scene space, i.e., its regularity. We demonstrate the usefulness of the proposed approach through extensive evaluation and ablation analysis on several benchmark datasets, including KITTI, NYU Depth V2, and SUN RGB-D. At test time, VA-DepthNet shows considerable improvements in depth prediction accuracy over the prior art and is also accurate in high-frequency regions of the scene space. At the time of writing, VA-DepthNet shows state-of-the-art results on the KITTI depth-prediction evaluation set benchmark and is the top-performing published approach.
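The abstract does not spell out how the first-order constraint is imposed inside the network, so the following is only a minimal sketch of the underlying quantity, not the VA-DepthNet implementation. It assumes a depth map stored as a list of rows, approximates the depth gradient with forward differences, and compares the gradients of a predicted and a reference depth map. The function names `depth_gradients` and `first_order_penalty` are hypothetical and chosen for illustration.

```python
def depth_gradients(depth):
    """Forward-difference gradients of a depth map given as a list of rows.

    Returns (gx, gy): horizontal differences with shape rows x (cols - 1)
    and vertical differences with shape (rows - 1) x cols.
    """
    rows, cols = len(depth), len(depth[0])
    gx = [[depth[r][c + 1] - depth[r][c] for c in range(cols - 1)]
          for r in range(rows)]
    gy = [[depth[r + 1][c] - depth[r][c] for c in range(cols)]
          for r in range(rows - 1)]
    return gx, gy


def first_order_penalty(pred, target):
    """Mean absolute difference between the gradients of two depth maps.

    Illustrative assumption: a first-order term penalises disagreement
    between predicted and reference depth gradients.
    """
    total, count = 0.0, 0
    for p_grad, t_grad in zip(depth_gradients(pred), depth_gradients(target)):
        for p_row, t_row in zip(p_grad, t_grad):
            for p, t in zip(p_row, t_row):
                total += abs(p - t)
                count += 1
    # A 1x1 map has no neighbouring pixels, hence no gradient terms.
    return total / count if count else 0.0
```

For example, a flat prediction compared against a reference that contains a depth step is penalised only where the step occurs, which is the sense in which a first-order term makes a model sensitive to depth discontinuities.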