Feature matching is a crucial technique in computer vision. In essence, it is a search problem: establishing correspondences between images. The key challenge is the lack of a well-defined search space, which leads to inaccurate point matches in current methods. To obtain a reasonable search space, this paper introduces a hierarchical feature matching framework, Area to Point Matching (A2PM), which first finds semantic area matches between images and then performs point matching within each matched area pair. The search space is thereby restricted to matched areas with salient features, which yields high matching precision. This well-chosen search space also alleviates the accuracy limitation of state-of-the-art Transformer-based matching methods. To realize the framework, we further propose the Semantic and Geometry Area Matching (SGAM) method, which uses semantic priors and geometric consistency to establish accurate area matches between images. Integrating SGAM with off-the-shelf Transformer-based matchers, our feature matching methods, which adopt the A2PM framework, achieve encouraging precision improvements over existing methods in extensive point matching and pose estimation experiments.
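The two-stage flow described above can be illustrated with a minimal sketch. It assumes that an area matcher returns pairs of axis-aligned boxes and that a point matcher returns point pairs in crop-local coordinates. All names here (`a2pm_match`, `crop`, `toy_area_matcher`, `toy_point_matcher`) and the box convention are hypothetical and not taken from the paper. The toy stubs merely stand in for SGAM and for a Transformer-based matcher, and any resizing of crops before point matching is omitted.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x0, y0, x1, y1); assumed convention
Point = Tuple[float, float]          # (x, y)
Image = Sequence[Sequence[float]]    # row-major grid standing in for an image
PointPair = Tuple[Point, Point]


def crop(img: Image, box: Box) -> List[List[float]]:
    """Cut the sub-grid covered by box out of img."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in img[y0:y1]]


def a2pm_match(
    img0: Image,
    img1: Image,
    area_matcher: Callable[[Image, Image], List[Tuple[Box, Box]]],
    point_matcher: Callable[[Image, Image], List[PointPair]],
) -> List[PointPair]:
    """Area matches first, then point matches inside each matched area pair."""
    matches: List[PointPair] = []
    for box0, box1 in area_matcher(img0, img1):
        local = point_matcher(crop(img0, box0), crop(img1, box1))
        for (px0, py0), (px1, py1) in local:
            # Shift crop-local coordinates back into full-image coordinates.
            matches.append(
                ((px0 + box0[0], py0 + box0[1]), (px1 + box1[0], py1 + box1[1]))
            )
    return matches


def toy_area_matcher(img0: Image, img1: Image) -> List[Tuple[Box, Box]]:
    """Placeholder for an area matcher: one fixed pair of boxes."""
    return [((0, 0, 4, 4), (2, 2, 6, 6))]


def toy_point_matcher(c0: Image, c1: Image) -> List[PointPair]:
    """Placeholder for a point matcher: pairs the centres of the two crops."""
    centre0 = (len(c0[0]) / 2, len(c0) / 2)
    centre1 = (len(c1[0]) / 2, len(c1) / 2)
    return [(centre0, centre1)]


image = [[0.0] * 8 for _ in range(8)]
result = a2pm_match(image, image, toy_area_matcher, toy_point_matcher)
```

Passing the two matchers as callables mirrors the claim that the area-matching stage can be combined with off-the-shelf point matchers: only the coordinate bookkeeping in `a2pm_match` ties the stages together.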