In the classical tracking-by-detection (TBD) paradigm, detection and tracking are performed separately and sequentially, and data association must be carried out carefully to achieve satisfactory tracking performance. This paper proposes a new end-to-end multi-object tracking framework that integrates object detection and multi-object tracking into a single model. First, the proposed framework eliminates the complex data association process of the classical TBD paradigm and requires no additional training. Second, the regression confidence of historical trajectories is investigated, and the possible state of each trajectory (weak object or strong object) in the current frame is predicted. A confidence fusion module is then designed to guide non-maximum suppression over trajectories and detections, yielding ordered and robust tracking. Third, integrating historical trajectory features enhances the regression performance of the detector, which better reflects the occlusion and disappearance patterns of objects in the real world. Lastly, extensive experiments are conducted on the commonly used KITTI and Waymo datasets. The results show that the proposed framework achieves robust tracking using only a 2D detector and a 3D detector, and that it is shown to be more accurate than many state-of-the-art TBD-based multi-modal tracking methods. The source code of the proposed method is available at https://github.com/wangxiyang2022/YONTD-MOT.
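To make the idea of confidence-guided non-maximum suppression over trajectories and detections more concrete, the following is a minimal sketch and not the authors' implementation. It assumes axis-aligned 2D boxes, a simple weighted average as the fusion rule, and an IoU threshold of 0.5; the names `Box`, `fuse_confidence`, and `confidence_guided_nms` are hypothetical and do not come from the YONTD-MOT repository. The sketch only illustrates how pooling trajectory boxes and fresh detections into one suppression step can stand in for an explicit data association stage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    """Axis-aligned 2D box with a confidence score (illustrative type)."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    track_id: Optional[int] = None  # None marks a fresh detection


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = ((a.x2 - a.x1) * (a.y2 - a.y1)
             + (b.x2 - b.x1) * (b.y2 - b.y1) - inter)
    return inter / union if union > 0 else 0.0


def fuse_confidence(history_score: float, regressed_score: float,
                    alpha: float = 0.5) -> float:
    """Assumed fusion rule: weighted average of a trajectory's historical
    confidence and its regression confidence in the current frame."""
    return alpha * history_score + (1.0 - alpha) * regressed_score


def confidence_guided_nms(trajectories: List[Box], detections: List[Box],
                          iou_thr: float = 0.5) -> List[Box]:
    """Greedy NMS over the pooled trajectory and detection boxes.

    A detection suppressed by an overlapping trajectory is treated as the
    same object, so the trajectory keeps its identity. Surviving boxes
    with track_id None are candidates for new tracks.
    """
    # Stable sort: trajectories are listed first, so they win score ties.
    candidates = sorted(trajectories + detections,
                        key=lambda b: b.score, reverse=True)
    kept: List[Box] = []
    for cand in candidates:
        if all(iou(cand, k) < iou_thr for k in kept):
            kept.append(cand)
    return kept


if __name__ == "__main__":
    track = Box(0, 0, 10, 10, fuse_confidence(0.9, 0.7), track_id=1)
    same_object = Box(1, 1, 11, 11, 0.7)    # overlaps the trajectory
    new_object = Box(50, 50, 60, 60, 0.9)   # overlaps nothing
    for box in confidence_guided_nms([track], [same_object, new_object]):
        print(box.track_id, round(box.score, 2))
```

In this toy setup the detection that overlaps the existing trajectory is suppressed, so the trajectory carries its identity forward, while the isolated detection survives as a candidate for a new track. How the actual framework classifies weak and strong objects and fuses confidences is specified in the paper and its source code, not here.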