Multi-object tracking (MOT) aims to estimate the bounding boxes and identities of objects in videos. Most methods fall roughly into two paradigms: tracking-by-detection and joint detection and association. Although the latter has attracted more attention and achieves performance comparable to the former, we argue that tracking-by-detection remains the better choice in terms of tracking accuracy. ByteTrack, for example, achieves 80.3 MOTA, 77.3 IDF1 and 63.1 HOTA on the MOT17 test set while running at 30 FPS on a single V100 GPU. However, under complex viewpoints, such as those of accelerating vehicles and UAVs, the performance of such a tracker, which relies on a uniform (constant-velocity) Kalman filter, degrades considerably and tracks are lost. In this paper, we propose a variable-speed Kalman filter algorithm based on environmental feedback and improve the matching process. The resulting tracker greatly improves tracking in complex variable-speed scenes while maintaining high tracking accuracy in relatively static scenes. On the MOT17 test set, it achieves higher MOTA and IDF1 than ByteTrack.
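As an illustration of why a constant-velocity motion model struggles under acceleration, the following is a minimal sketch of a one-dimensional constant-velocity Kalman filter. It is not the proposed variable-speed filter and not ByteTrack's implementation; the function names, the noise parameters `q` and `r`, and the two synthetic trajectories are assumptions chosen only for demonstration. The sketch compares the innovation (measurement minus prediction) on a constant-velocity target with that on a constantly accelerating one.

```python
def cv_kalman_innovations(measurements, dt=1.0, q=1e-2, r=1.0):
    """Run a 1D constant-velocity Kalman filter and return the innovations.

    State is (position x, velocity v). q and r are assumed process and
    measurement noise variances, picked for illustration only.
    """
    x, v = measurements[0], 0.0
    # Covariance P as four scalars; velocity starts out highly uncertain.
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 10.0
    innovations = []
    for z in measurements[1:]:
        # Predict with F = [[1, dt], [0, 1]]: velocity is assumed constant.
        x = x + dt * v
        # P = F P F^T + Q, with Q = diag(q, q).
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        p00, p01, p10, p11 = n00, n01, n10, n11
        # Update with H = [1, 0]: only position is observed.
        y = z - x                      # innovation
        s = p00 + r                    # innovation variance
        k0, k1 = p00 / s, p10 / s      # Kalman gain
        x += k0 * y
        v += k1 * y
        # P = (I - K H) P
        p00, p01, p10, p11 = (
            (1 - k0) * p00, (1 - k0) * p01,
            p10 - k1 * p00, p11 - k1 * p01,
        )
        innovations.append(y)
    return innovations


def tail_mean_abs(values, n=10):
    """Mean absolute value of the last n entries."""
    tail = values[-n:]
    return sum(abs(t) for t in tail) / len(tail)


# Noise-free synthetic trajectories (hypothetical, for illustration).
steady = [2.0 * t for t in range(60)]                            # constant velocity
accelerating = [2.0 * t + 0.5 * 0.5 * t * t for t in range(60)]  # acceleration 0.5

lag_steady = tail_mean_abs(cv_kalman_innovations(steady))
lag_accel = tail_mean_abs(cv_kalman_innovations(accelerating))
print("residual, constant velocity:", lag_steady)
print("residual, accelerating:     ", lag_accel)
```

On the constant-velocity trajectory the innovation decays toward zero, whereas on the accelerating trajectory the filter settles into a persistent lag, because each update has to correct a velocity estimate that the motion model assumes is fixed. In a tracker, such a lag shifts the predicted box away from the detection, which is one plausible way association can fail in the variable-speed scenes described above.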