In stereo-matching knowledge distillation methods for self-supervised monocular depth estimation, the knowledge of a stereo-matching network is distilled into a monocular depth network through pseudo-depth maps. These methods generally use a learning-based stereo-confidence network to identify errors in the pseudo-depth maps so that the errors are not transferred. However, learning-based stereo-confidence networks must be trained with ground truth (GT), which is not available in a self-supervised setting. In this paper, we propose a method that identifies and filters errors in the pseudo-depth map by checking the consistency of multiple disparity maps, requiring neither GT nor a training process. Experimental results show that the proposed method outperforms previous methods and works well across various configurations by filtering out erroneous areas where stereo matching is vulnerable, such as textureless regions, occlusion boundaries, and reflective surfaces.
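As a rough illustration of the idea of a training-free consistency check, the sketch below masks pixels where several disparity estimates of the same view disagree. The abstract does not specify how the multiple disparity maps are obtained or which consistency criterion is used, so everything here is an assumption made for illustration: the function names `consistency_mask` and `pseudo_depth`, the median-based agreement test, the tolerances `rel_tol` and `abs_tol`, and the camera values in the example.

```python
import numpy as np


def consistency_mask(disparities, rel_tol=0.05, abs_tol=1.0):
    """Return (median_disparity, keep_mask) for N disparity maps of one view.

    disparities: array-like of shape (N, H, W) with N >= 2.
    A pixel is kept when every estimate lies within
    max(abs_tol, rel_tol * |median|) of the per-pixel median.
    This criterion is a hypothetical stand-in, not the paper's rule.
    """
    stack = np.asarray(disparities, dtype=np.float64)
    if stack.ndim != 3 or stack.shape[0] < 2:
        raise ValueError("expected shape (N, H, W) with N >= 2")
    median = np.median(stack, axis=0)
    # Largest disagreement of any single estimate with the median.
    deviation = np.max(np.abs(stack - median), axis=0)
    tolerance = np.maximum(abs_tol, rel_tol * np.abs(median))
    return median, deviation <= tolerance


def pseudo_depth(disparity, mask, focal_px, baseline_m, eps=1e-6):
    """Convert disparity to depth; rejected or non-positive pixels become NaN."""
    valid = mask & (disparity > eps)
    depth = np.full(disparity.shape, np.nan)
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth


# Three toy 2x2 disparity maps; the bottom-right pixel disagrees strongly.
d1 = np.array([[10.0, 20.0], [30.0, 40.0]])
d2 = np.array([[10.5, 20.0], [30.0, 55.0]])
d3 = np.array([[9.8, 20.2], [29.5, 41.0]])

reference, mask = consistency_mask([d1, d2, d3])
# Focal length (pixels) and baseline (metres) are illustrative values only.
depth = pseudo_depth(reference, mask, focal_px=720.0, baseline_m=0.54)
```

In a distillation setup, a mask like this would typically be used to exclude the rejected pixels from the loss that supervises the monocular network, so that inconsistent pseudo-depth values are not transferred.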