Multi-Camera Multi-Object Tracking (MC-MOT) uses information from multiple views to better handle occlusion and crowded scenes. Graph-based approaches to tracking have recently become popular. However, many current graph-based methods do not make effective use of spatial and temporal consistency. Instead, they take the output of single-camera trackers as input, and these trackers are prone to fragmentation and ID switch errors. In this paper, we propose a novel reconfigurable graph model that first associates all detected objects across cameras spatially and is then reconfigured into a temporal graph for temporal association. This two-stage association approach enables us to extract robust spatially and temporally aware features and to address the problem of fragmented tracklets. Furthermore, our model is designed for online tracking, making it suitable for real-world applications. Experimental results show that the proposed graph model extracts more discriminative features for object tracking and achieves state-of-the-art performance on several public datasets.
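The two-stage idea described in the abstract (associate detections across cameras within a frame, then link the resulting objects over time, frame by frame) can be illustrated with a minimal sketch. This is not the proposed graph model: the abstract does not specify its features or matching procedure, so the sketch swaps in plain distance thresholds and greedy matching. It assumes that every detection already carries a position on a ground plane shared by all cameras, and all names here (`Detection`, `Node`, `spatial_association`, `OnlineTracker`, `radius`, `max_move`) are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count
from math import dist


@dataclass
class Detection:
    camera: int   # camera index (hypothetical field)
    xy: tuple     # assumed position on a ground plane shared by all cameras


@dataclass
class Node:
    """One physical object at one frame, merged from several camera views."""
    members: list = field(default_factory=list)

    @property
    def centroid(self):
        xs = [d.xy[0] for d in self.members]
        ys = [d.xy[1] for d in self.members]
        return (sum(xs) / len(xs), sum(ys) / len(ys))

    @property
    def cameras(self):
        return {d.camera for d in self.members}


def spatial_association(detections, radius=0.5):
    """Stage 1 (stand-in): group same-frame detections from different cameras."""
    nodes = []
    for det in detections:
        # A node is a candidate if it is close enough and does not already
        # contain a detection from the same camera.
        candidates = [n for n in nodes
                      if det.camera not in n.cameras
                      and dist(n.centroid, det.xy) <= radius]
        if candidates:
            nearest = min(candidates, key=lambda n: dist(n.centroid, det.xy))
            nearest.members.append(det)
        else:
            nodes.append(Node([det]))
    return nodes


class OnlineTracker:
    """Stage 2 (stand-in): link the current frame's nodes to existing tracks."""

    def __init__(self, max_move=1.0):
        self.max_move = max_move
        self.tracks = {}        # track id -> last known centroid
        self._ids = count(1)

    def update(self, detections):
        nodes = spatial_association(detections)
        # All (distance, track id, node index) pairs within the motion gate,
        # sorted so that the closest pairs are matched first.
        pairs = sorted(
            (dist(pos, n.centroid), tid, i)
            for tid, pos in self.tracks.items()
            for i, n in enumerate(nodes)
            if dist(pos, n.centroid) <= self.max_move
        )
        assigned, used_tracks = {}, set()
        for _, tid, i in pairs:
            if tid not in used_tracks and i not in assigned:
                assigned[i] = tid
                used_tracks.add(tid)
        current = {}
        for i, n in enumerate(nodes):
            tid = assigned.get(i)
            if tid is None:
                tid = next(self._ids)   # an unmatched node starts a new track
            self.tracks[tid] = n.centroid
            current[tid] = n.centroid
        return current                  # track id -> centroid for this frame
```

Calling `OnlineTracker().update(...)` once per frame with that frame's detections returns the track identities visible in the frame. Because association runs on merged multi-camera nodes rather than on the output of per-camera trackers, an object hidden from one camera can still keep its identity through another view, which is the motivation the abstract gives for associating spatially first. The sketch never retires stale tracks; a practical tracker would need to.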