In recent years, monocular depth estimation and completion have received increasing attention from the computer vision community because of their widespread applications. In this paper, we introduce novel physics (geometry)-driven deep learning frameworks for these two tasks, based on the assumption that 3D scenes are composed of piecewise planes. Instead of directly estimating the depth map or completing the sparse depth map, we propose to estimate the surface normal and plane-to-origin distance maps, or to complete the sparse surface normal and distance maps, as intermediate outputs. To this end, we develop a normal-distance head that outputs pixel-level surface normal and distance. The surface normal and distance maps are regularized by a plane-aware consistency constraint that we develop, and are then transformed into depth maps. Furthermore, we integrate an additional depth head to strengthen the robustness of the proposed frameworks. Extensive experiments on the NYU-Depth-v2, KITTI and SUN RGB-D datasets demonstrate that our method outperforms prior state-of-the-art monocular depth estimation and completion methods. The source code will be available at https://github.com/ShuweiShao/NDDepth.
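The transformation from a surface normal and plane-to-origin distance to depth follows from standard pinhole geometry. A 3D point X = Z K^{-1} p that lies on the plane n^T X = d has depth Z = d / (n^T K^{-1} p), where p is the homogeneous pixel coordinate. The sketch below illustrates this relation for a single pixel. It is not taken from the NDDepth code: the names `Intrinsics` and `depth_from_normal_distance`, the sign convention n^T X = d, the skew-free camera model, and the `eps` guard are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Intrinsics:
    """Hypothetical pinhole camera intrinsics (no skew), in pixels."""
    fx: float
    fy: float
    cx: float
    cy: float


def depth_from_normal_distance(
    u: float,
    v: float,
    normal: Tuple[float, float, float],
    distance: float,
    K: Intrinsics,
    eps: float = 1e-6,
) -> float:
    """Depth at pixel (u, v) of the plane normal . X = distance.

    Assumes the convention normal . X = distance in camera coordinates;
    other sign conventions flip the sign of the result.
    """
    # Back-projected ray K^{-1} [u, v, 1]^T for a skew-free pinhole camera.
    ray = ((u - K.cx) / K.fx, (v - K.cy) / K.fy, 1.0)
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < eps:
        # The viewing ray is (nearly) parallel to the plane.
        return float("inf")
    return distance / denom


# Example intrinsics chosen for illustration only.
K = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)

# A fronto-parallel plane two units away has depth 2.0 at every pixel.
frontal_depth = depth_from_normal_distance(100.0, 50.0, (0.0, 0.0, 1.0), 2.0, K)

# A tilted plane through the 3D point (1, 0, 4), which projects to pixel (445, 240).
tilted_depth = depth_from_normal_distance(445.0, 240.0, (0.6, 0.0, 0.8), 3.8, K)
```

Applying the function to every pixel of predicted normal and distance maps would yield a dense depth map. A practical implementation would vectorize this over the image and handle near-zero denominators in a way suited to training.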