Data association is at the core of many computer vision tasks, e.g., multiple object tracking (MOT), image matching, and point cloud registration. However, current data association solutions have two main shortcomings. First, they mostly ignore intra-view context information. Second, they either train deep association models end-to-end and thus hardly benefit from optimization-based assignment methods, or they use an off-the-shelf neural network only to extract features. In this paper, we propose a general learnable graph matching method to address these issues. Specifically, we model the intra-view relationships as an undirected graph, so that data association turns into a general graph matching problem between these graphs. Furthermore, to make the optimization end-to-end differentiable, we relax the original graph matching problem into a continuous quadratic program and then incorporate its training into a deep graph neural network via the KKT conditions and the implicit function theorem. On the MOT task, our method achieves state-of-the-art performance on several MOT datasets. For image matching, our method outperforms state-of-the-art methods on a popular indoor dataset, ScanNet. For point cloud registration, we also achieve competitive results. Code will be available at https://github.com/jiaweihe1996/GMTracker.
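To make the relaxation step concrete, the following is a minimal sketch of graph matching posed as a continuous quadratic program over doubly stochastic matrices. It is not the authors' implementation: the Koopmans-Beckmann-style objective, the Frank-Wolfe solver, and the names `best_permutation`, `relaxed_graph_matching`, and `node_weight` are assumptions chosen for illustration, and the sketch covers neither the learned features nor the differentiation through the KKT conditions. The toy inputs use strong node similarities, so the correct matching is unambiguous.

```python
import itertools
import numpy as np


def best_permutation(score):
    """Return the permutation matrix S maximizing <score, S>.

    Brute force over all permutations, so only suitable for toy sizes;
    a Hungarian solver such as scipy.optimize.linear_sum_assignment
    would replace it in practice.
    """
    n = score.shape[0]
    rows = np.arange(n)
    best = max(itertools.permutations(range(n)),
               key=lambda cols: score[rows, list(cols)].sum())
    S = np.zeros((n, n))
    S[rows, list(best)] = 1.0
    return S


def relaxed_graph_matching(A1, A2, B, node_weight=1.0, iters=20):
    """Frank-Wolfe ascent on the relaxed objective
        tr(X^T A1 X A2) + node_weight * tr(B^T X),
    where X ranges over doubly stochastic matrices.
    A1, A2 are edge-affinity (adjacency) matrices, B holds node similarities.
    """
    n = A1.shape[0]
    X = np.full((n, n), 1.0 / n)          # start at the uniform soft assignment
    for k in range(iters):
        grad = A1 @ X @ A2.T + A1.T @ X @ A2 + node_weight * B
        S = best_permutation(grad)        # linear maximization step
        X += 2.0 / (k + 2.0) * (S - X)    # convex step keeps X doubly stochastic
    return X


# Toy example: graph 2 is graph 1 (a 4-node path) with relabelled nodes.
A1 = np.array([[0, 1, 0, 0],
               [1, 0, 1, 0],
               [0, 1, 0, 1],
               [0, 0, 1, 0]], dtype=float)
perm = np.array([2, 0, 3, 1])             # node i in graph 1 is node perm[i] in graph 2
A2 = np.zeros_like(A1)
A2[np.ix_(perm, perm)] = A1

F1 = np.eye(4)                            # hypothetical one-hot node features
F2 = np.zeros_like(F1)
F2[perm] = F1
B = F1 @ F2.T                             # node similarity matrix

X = relaxed_graph_matching(A1, A2, B, node_weight=10.0)
match = best_permutation(X).argmax(axis=1)  # round the soft assignment
print(match.tolist())                     # [2, 0, 3, 1]
```

Because the relaxed variable `X` is continuous, the solution of such a program varies smoothly with the affinities `A1`, `A2`, and `B`, which is the property that makes gradient-based training through the matching step possible in principle.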