Local feature matching aims to establish sparse correspondences between a pair of images. Detector-free methods have recently shown generally better performance, but they remain unsatisfactory on image pairs with large scale differences. In this paper, we propose Patch Area Transportation with Subdivision (PATS) to tackle this issue. Instead of building an expensive image pyramid, we start by splitting the original image pair into equal-sized patches, then gradually resize and subdivide them into smaller patches of the same scale. However, estimating scale differences between these patches is non-trivial, since the scale differences are determined by both relative camera poses and scene structures and thus vary spatially across an image pair. Moreover, ground-truth scale differences are hard to obtain for real scenes. To this end, we propose patch area transportation, which enables learning scale differences in a self-supervised manner. In contrast to bipartite graph matching, which only handles one-to-one matching, our patch area transportation can deal with many-to-many relationships. PATS improves both matching accuracy and coverage, and shows superior performance in downstream tasks such as relative pose estimation, visual localization, and optical flow estimation. The source code is available at \url{https://zju3dv.github.io/pats/}.
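To illustrate the difference between one-to-one matching and a many-to-many transport of patch area, the following is a minimal sketch using generic entropy-regularised optimal transport (Sinkhorn iterations). It is not the PATS implementation: the function name `sinkhorn_plan`, the parameters `eps` and `n_iters`, and all numbers in the toy example are assumptions made for illustration, and how the method actually learns scale differences is not specified in this abstract.

```python
import math


def sinkhorn_plan(cost, source_area, target_area, eps=0.05, n_iters=100):
    """Entropy-regularised transport of area from source to target patches.

    cost[i][j]     : matching cost between source patch i and target patch j
    source_area[i] : area mass carried by source patch i
    target_area[j] : area mass accepted by target patch j
                     (must have the same total as source_area)
    Returns a plan whose rows sum to source_area and whose columns
    sum (approximately) to target_area.
    """
    n, m = len(source_area), len(target_area)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale columns to match the target areas ...
        v = [target_area[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
        # ... then rescale rows to match the source areas.
        u = [source_area[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example (hypothetical numbers): two source patches of area 1.0 each,
# four target patches of area 0.5 each. Source patch 0 resembles target
# patches 0 and 1; source patch 1 resembles target patches 2 and 3.
cost = [[0.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 0.0, 0.0]]
plan = sinkhorn_plan(cost, source_area=[1.0, 1.0],
                     target_area=[0.5, 0.5, 0.5, 0.5])

for row in plan:
    print([round(x, 3) for x in row])
```

In this toy setting each source patch spreads its area over two target patches, which a one-to-one bipartite assignment cannot express. The number of target patches that receive area from one source patch gives a rough cue about the relative scale between the two views.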