Self-supervised depth estimation has evolved into an image reconstruction task that minimizes a photometric loss. While recent methods have made strides in indoor depth estimation, they often produce inconsistent depth estimates in textureless areas and unsatisfactory depth discrepancies at object boundaries. To address these issues, we propose GAM-Depth, which is built on two novel components: a gradient-aware mask and semantic constraints. The gradient-aware mask enables adaptive and robust supervision for both key areas and textureless regions by allocating weights based on gradient magnitudes. The semantic constraints, which rely on a co-optimization network and proxy semantic labels derived from a pretrained segmentation model, improve depth discrepancies at object boundaries in indoor self-supervised depth estimation. Experiments on three indoor datasets (NYUv2, ScanNet, and InteriorNet) show that GAM-Depth outperforms existing methods and achieves state-of-the-art performance, a meaningful step forward in indoor depth estimation. Our code will be available at https://github.com/AnqiCheng1234/GAM-Depth.
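To make the idea of gradient-based weighting concrete, the following is a minimal NumPy sketch of how per-pixel weights derived from gradient magnitudes could modulate an L1 photometric error. It is an illustration under stated assumptions, not the GAM-Depth formulation: the function names, the linear mapping of gradient magnitude into `[floor, 1]`, the `floor` value, and the single-channel input are all hypothetical choices made for this example.

```python
import numpy as np

def gradient_magnitude(gray):
    """Per-pixel gradient magnitude of a 2-D grayscale image."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy)

def gradient_aware_weights(gray, floor=0.2):
    """Hypothetical weighting (an assumption, not the paper's formula):
    map gradient magnitude linearly into [floor, 1], so that high-gradient
    pixels get the largest weight while textureless pixels still
    contribute with weight `floor`."""
    mag = gradient_magnitude(gray)
    peak = mag.max()
    if peak == 0:
        # Completely flat image: fall back to uniform weights.
        return np.ones_like(mag)
    return floor + (1.0 - floor) * mag / peak

def weighted_photometric_l1(target, reconstructed, weights):
    """Weight-normalized mean absolute photometric error."""
    err = np.abs(target.astype(np.float64) - reconstructed.astype(np.float64))
    return float((weights * err).sum() / weights.sum())

# Toy example: a vertical step edge in an otherwise flat image.
img = np.zeros((4, 6))
img[:, 3:] = 1.0
w = gradient_aware_weights(img)
loss = weighted_photometric_l1(img, img + 0.1, w)
```

In this toy image the two columns adjacent to the step receive the maximum weight and the flat columns receive `floor`, so flat regions are down-weighted but not ignored. Keeping a nonzero floor is one simple way to retain supervision in textureless regions; the actual mask in GAM-Depth may be defined differently.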