State-of-the-art stereo matching methods typically aggregate a full cost volume with costly 3D convolutions, whose computational demands make mobile deployment challenging. Directly applying 2D convolutions for cost aggregation often results in edge blurring, detail loss, and mismatches in textureless regions. More complex operations, such as deformable convolutions and iterative warping, can partially alleviate these issues; however, they are not mobile-friendly. In this paper, we present a novel bilateral aggregation network (BANet) for mobile stereo matching that produces high-quality results with sharp edges and fine details using only 2D convolutions. Specifically, we first separate the full cost volume into detailed and smooth volumes using a spatial attention map, then perform detailed and smooth aggregations accordingly, and finally fuse both to obtain the final disparity map. Additionally, to accurately identify high-frequency detailed regions and low-frequency smooth/textureless regions, we propose a new scale-aware spatial attention module. Experimental results demonstrate that our 2D model, BANet-2D, significantly outperforms other mobile-friendly methods, achieving 35.3\% higher accuracy on the KITTI 2015 leaderboard than MobileStereoNet-2D, with faster runtime on mobile devices. The extended 3D version, BANet-3D, achieves the highest accuracy among all real-time methods on high-end GPUs. Code: \textcolor{magenta}{https://github.com/gangweiX/BANet}.
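The separate, aggregate, and fuse pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the learned 2D convolutional aggregation branches are replaced here by fixed box filters of two different radii, the attention map is taken as a given input rather than predicted by the scale-aware spatial attention module, and all names (`box_filter`, `bilateral_aggregate`, `soft_argmin`, `detail_radius`, `smooth_radius`) are hypothetical. Soft-argmin regression is a common way to read a disparity map from a cost volume and is assumed here, not taken from the abstract.

```python
import numpy as np


def box_filter(volume, radius):
    """Average each disparity slice over a (2r+1) x (2r+1) window.

    Stand-in for a learned 2D aggregation branch (assumption).
    volume: array of shape (D, H, W).
    """
    volume = np.asarray(volume, dtype=np.float64)
    if radius == 0:
        return volume.copy()
    k = 2 * radius + 1
    # Edge padding keeps the output the same size as the input.
    padded = np.pad(volume, ((0, 0), (radius, radius), (radius, radius)), mode="edge")
    h, w = volume.shape[1:]
    out = np.zeros_like(volume)
    for dy in range(k):
        for dx in range(k):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / (k * k)


def bilateral_aggregate(cost, attention, detail_radius=1, smooth_radius=4):
    """Split a cost volume with an attention map, aggregate each part, and fuse.

    cost:      (D, H, W) matching cost volume.
    attention: (H, W) map in [0, 1]; values near 1 mark detailed regions.
    """
    a = np.asarray(attention, dtype=np.float64)[None, :, :]
    cost = np.asarray(cost, dtype=np.float64)
    detail_volume = a * cost          # high-frequency regions
    smooth_volume = (1.0 - a) * cost  # smooth / textureless regions
    # Small receptive field preserves edges; large one propagates over flat areas.
    detail_agg = box_filter(detail_volume, detail_radius)
    smooth_agg = box_filter(smooth_volume, smooth_radius)
    return detail_agg + smooth_agg    # simple additive fusion (assumption)


def soft_argmin(cost):
    """Regress disparity as the expectation under softmax(-cost)."""
    logits = -np.asarray(cost, dtype=np.float64)
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    disparities = np.arange(prob.shape[0], dtype=np.float64)[:, None, None]
    return (prob * disparities).sum(axis=0)
```

With `cost` of shape `(D, H, W)` and `attention` of shape `(H, W)`, calling `soft_argmin(bilateral_aggregate(cost, attention))` yields an `(H, W)` disparity map. The two radii only illustrate the idea that detailed regions benefit from a small aggregation window while textureless regions benefit from a large one.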