In self-supervised monocular depth estimation, stereo-matching knowledge distillation methods transfer the knowledge of a stereo-matching network to a monocular depth network through pseudo-depth maps. These methods generally rely on a learning-based stereo-confidence network to identify errors in the pseudo-depth maps so that the errors are not transferred. However, learning-based stereo-confidence networks must be trained with ground truth (GT), which is not available in a self-supervised setting. In this paper, we propose a method that identifies and filters errors in the pseudo-depth map by checking the consistency of multiple disparity maps, requiring neither GT nor a training process. Experimental results show that the proposed method outperforms previous methods and works well across various configurations by filtering out erroneous areas where stereo matching is vulnerable, especially textureless regions, occlusion boundaries, and reflective surfaces.
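The idea of checking several disparity maps against each other can be illustrated with a minimal sketch. The abstract does not state how the disparity maps are produced or which consistency criterion is used, so everything specific here is an assumption made for illustration: the per-pixel spread (maximum minus minimum) as the agreement measure, the threshold `tol`, the median as the reference disparity, and the function names `consistency_mask` and `filter_pseudo_depth`. Only the standard stereo relation depth = focal length × baseline / disparity is taken as given.

```python
import numpy as np


def consistency_mask(disparities, tol=1.0):
    """Return a boolean (H, W) mask that is True where the maps agree.

    disparities: array-like of shape (K, H, W), K >= 2, in pixels.
    tol: hypothetical agreement threshold in pixels (not from the paper).
    """
    d = np.asarray(disparities, dtype=np.float64)
    if d.ndim != 3 or d.shape[0] < 2:
        raise ValueError("expected at least two maps with shape (K, H, W)")
    # Assumed criterion: per-pixel spread across the K maps.
    spread = d.max(axis=0) - d.min(axis=0)
    return spread <= tol


def filter_pseudo_depth(disparities, focal_px, baseline_m, tol=1.0):
    """Convert disparity to depth and mark inconsistent pixels as NaN."""
    d = np.asarray(disparities, dtype=np.float64)
    mask = consistency_mask(d, tol)
    # Assumed reference: the per-pixel median of the K maps.
    ref = np.median(d, axis=0)
    # Non-positive disparities cannot be converted to a finite depth.
    mask &= ref > 0
    depth = np.full(ref.shape, np.nan)
    depth[mask] = focal_px * baseline_m / ref[mask]
    return depth, mask


if __name__ == "__main__":
    maps = np.array([
        [[10.0, 10.0], [10.0, 10.0]],
        [[10.0, 10.5], [10.0, 30.0]],  # bottom-right pixel disagrees
        [[10.0, 10.0], [10.0, 10.0]],
    ])
    depth, mask = filter_pseudo_depth(maps, focal_px=100.0, baseline_m=0.5)
    print(mask)
    print(depth)
```

Because the mask is computed from the disparity maps alone, no GT and no trained confidence network are involved. Pixels marked NaN would simply be excluded from the distillation loss of the monocular network.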