Multiple Object Tracking (MOT) focuses on modeling the relationship of detected objects among consecutive frames and merging them into different trajectories. MOT remains a challenging task because noisy and confusing detection results often hinder the final performance. Moreover, most existing research focuses on improving detection algorithms and association strategies. We therefore propose a novel framework that can effectively predict and mask out the noisy and confusing detection results before associating the objects into trajectories. In particular, we formulate such "bad" detection results as a sequence of events and adopt the spatio-temporal point process to model such events. Traditionally, the occurrence rate in a point process is characterized by an explicitly defined intensity function, which depends on prior knowledge of the specific task. Designing a proper model is therefore expensive and time-consuming, and the resulting model has limited ability to generalize. To tackle this problem, we adopt the convolutional recurrent neural network (conv-RNN) to instantiate the point process, so that its intensity function is learned automatically from the training data. Furthermore, we show that our method captures both temporal and spatial evolution, which is essential in modeling events for MOT. Experimental results demonstrate notable improvements in addressing noisy and confusing detection results on MOT datasets. Combining our baseline MOT algorithm with the spatio-temporal point process model achieves improved state-of-the-art performance.
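To make the notion of an explicitly defined intensity function concrete, the following is a minimal sketch of a hand-designed spatio-temporal intensity and of how it could be used to mask detections. It is not the method proposed here: the parametric form (exponential decay in time, Gaussian falloff in space), the parameter names `mu`, `alpha`, `beta`, `sigma`, the `threshold`, and the helper `mask_detections` are all assumptions chosen for illustration. The proposed framework replaces this kind of hand-written function with one learned by a conv-RNN.

```python
import math

def intensity(t, x, y, history, mu=0.05, alpha=1.0, beta=0.5, sigma=20.0):
    """Hand-designed spatio-temporal intensity (hypothetical parametric form).

    history: list of (t_i, x_i, y_i) past "bad detection" events.
    mu is a constant background rate; each past event adds a contribution
    that decays exponentially in time and as a Gaussian in space.
    """
    rate = mu
    for t_i, x_i, y_i in history:
        if t_i >= t:
            continue  # only strictly earlier events influence the rate at time t
        temporal = math.exp(-beta * (t - t_i))
        squared_dist = (x - x_i) ** 2 + (y - y_i) ** 2
        spatial = math.exp(-squared_dist / (2 * sigma ** 2))
        rate += alpha * temporal * spatial
    return rate

def mask_detections(t, detections, history, threshold=0.5):
    """Keep only detections whose centre lies where the intensity is at most threshold."""
    return [(x, y) for x, y in detections
            if intensity(t, x, y, history) <= threshold]

# One past "bad" event near (100, 100) at frame 9 (made-up illustrative values).
history = [(9.0, 100.0, 100.0)]
detections = [(102.0, 98.0), (400.0, 300.0)]
kept = mask_detections(10.0, detections, history)
```

In this toy setting the detection close to the recent event is masked out and the distant one is kept. Every constant above would have to be tuned by hand for each task, which is the cost that learning the intensity from data is meant to avoid.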