Despite the remarkable progress of learning-based stereo matching algorithms, disparity estimation in low-texture, occluded, and boundary regions remains a bottleneck that limits performance. To tackle these challenges, geometric guidance such as plane information is needed, as it provides intuitive cues about disparity consistency and affinity similarity. In this paper, we propose a normal-incorporated joint learning framework consisting of two modules: non-local disparity propagation (NDP) and affinity-aware residual learning (ARL). The estimated normal map is first used to compute a non-local affinity matrix and a non-local offset, which drive spatial propagation at the disparity level. To enhance geometric consistency, especially in low-texture regions, the estimated normal map is then used to compute a local affinity matrix, which tells the residual learning where each correction should refer and thereby improves its efficiency. Extensive experiments on several public datasets, including Scene Flow, KITTI 2015, and Middlebury 2014, validate the effectiveness of the proposed method. At the time this work was completed, our approach ranked 1st for stereo matching on foreground pixels on the KITTI 2015 dataset and 3rd on the Scene Flow dataset among all published works.
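The idea of propagating disparity with normal-derived affinities can be illustrated with a minimal NumPy sketch. It is not the paper's NDP module: the abstract states that the affinity matrix and the offsets are computed from the estimated normal map, but does not say how. Here the sampling offsets are fixed integers supplied by the caller, the affinity is assumed to be a softmax over the cosine similarity of surface normals, and the function name `propagate_disparity` and the `temperature` parameter are hypothetical.

```python
import numpy as np


def propagate_disparity(disp, normals, offsets, temperature=0.1):
    """Affinity-weighted disparity propagation (illustrative sketch only).

    disp:        (H, W) disparity map.
    normals:     (H, W, 3) unit surface normals.
    offsets:     list of (dy, dx) integer sampling offsets. These are fixed
                 here by assumption; the paper derives them from the normals.
    temperature: softmax temperature (hypothetical parameter).
    """
    h, w = disp.shape
    ys, xs = np.mgrid[0:h, 0:w]
    logits, samples = [], []
    for dy, dx in offsets:
        # Clamp sampling coordinates to the image bounds.
        ny = np.clip(ys + dy, 0, h - 1)
        nx = np.clip(xs + dx, 0, w - 1)
        # Cosine similarity between each pixel's normal and the sampled normal.
        sim = np.sum(normals * normals[ny, nx], axis=-1)
        logits.append(sim / temperature)
        samples.append(disp[ny, nx])
    logits = np.stack(logits)    # (K, H, W)
    samples = np.stack(samples)  # (K, H, W)
    # Softmax over the K samples: affinities sum to 1 at every pixel.
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Each output disparity is an affinity-weighted mix of sampled disparities.
    return np.sum(weights * samples, axis=0)
```

Including `(0, 0)` in `offsets` lets each pixel keep a share of its own disparity. Because the weights favour samples whose normals agree with the centre pixel, disparities are mixed mainly within a plane and only weakly across plane boundaries, which is the intuition the abstract gives for using plane information as guidance.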