Feature matching is a crucial technique in computer vision. A unified view of this task treats it as a search problem: the goal is an efficient strategy that narrows the search space down to point matches between images. A key aspect of any such strategy is the search space itself, which current approaches do not define carefully, and this limits matching accuracy. This paper therefore focuses on the search space and proposes setting the initial search space for point matching to matched image areas that contain prominent semantics, termed semantic area matches. This search space benefits point matching through salient features and alleviates the accuracy limitation of recent Transformer-based matching methods. To obtain this search space, we introduce a hierarchical feature matching framework, Area to Point Matching (A2PM), which first finds semantic area matches between images and then performs point matching within those area matches. We further propose the Semantic and Geometry Area Matching (SGAM) method to realize this framework; it uses semantic priors and geometric consistency to establish accurate area matches between images. By integrating SGAM with off-the-shelf state-of-the-art matchers, our method, which adopts the A2PM framework, achieves encouraging precision improvements in extensive point matching and pose estimation experiments.
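The two-stage idea behind A2PM (match areas first, then match points inside each area pair) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `a2pm_match`, the box format `(x_min, y_min, x_max, y_max)`, and the `point_matcher` callable interface are all assumptions made for illustration, and the area matches are taken as given rather than produced by SGAM. A real pipeline would likely also resize crops to the matcher's expected input size, which is omitted here.

```python
import numpy as np


def a2pm_match(img0, img1, area_matches, point_matcher):
    """Run a point matcher inside each matched area pair and map the
    resulting points back to full-image coordinates.

    area_matches: list of (box0, box1); each box is a hypothetical
        (x_min, y_min, x_max, y_max) tuple in pixel coordinates.
    point_matcher: hypothetical callable(crop0, crop1) -> (pts0, pts1),
        where each output is an (N, 2) array of (x, y) crop coordinates.
    """
    all0, all1 = [], []
    for (x0a, y0a, x0b, y0b), (x1a, y1a, x1b, y1b) in area_matches:
        # Restrict the search space to the matched areas.
        crop0 = img0[y0a:y0b, x0a:x0b]
        crop1 = img1[y1a:y1b, x1a:x1b]
        pts0, pts1 = point_matcher(crop0, crop1)
        pts0 = np.asarray(pts0, dtype=float).reshape(-1, 2)
        pts1 = np.asarray(pts1, dtype=float).reshape(-1, 2)
        # Shift crop coordinates by each area's top-left corner.
        all0.append(pts0 + np.array([x0a, y0a], dtype=float))
        all1.append(pts1 + np.array([x1a, y1a], dtype=float))
    if not all0:
        return np.empty((0, 2)), np.empty((0, 2))
    return np.concatenate(all0), np.concatenate(all1)


def center_matcher(crop0, crop1):
    """Toy stand-in for a real matcher: pairs the two crop centres."""
    h0, w0 = crop0.shape[:2]
    h1, w1 = crop1.shape[:2]
    return [[w0 / 2, h0 / 2]], [[w1 / 2, h1 / 2]]
```

Any off-the-shelf matcher could be wrapped to fit the assumed `point_matcher` interface; `center_matcher` exists only so the sketch runs without external models.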