Many multi-object tracking (MOT) approaches that employ the Kalman filter as a motion predictor assume constant velocity and Gaussian-distributed filtering noise. These assumptions make Kalman filter-based trackers effective in linear motion scenarios, but they become a key limitation when estimating future object locations in scenarios involving non-linear motion and occlusions. To address this issue, we propose AM-SORT, a motion-based MOT approach with an adaptable motion predictor that adapts to estimate non-linear uncertainties. AM-SORT is a novel extension of the SORT-series trackers that replaces the Kalman filter with a transformer architecture as the motion predictor. We introduce a historical trajectory embedding that enables the transformer to extract spatio-temporal features from a sequence of bounding boxes. On DanceTrack, AM-SORT achieves competitive performance compared to state-of-the-art trackers, with 56.3 IDF1 and 55.6 HOTA. We conduct extensive experiments to demonstrate the effectiveness of our method in predicting non-linear movement under occlusions.
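To make the constant-velocity limitation concrete, the following is a minimal sketch of a Kalman predict step applied to a target that is turning. It is not AM-SORT's implementation: the function name `constant_velocity_predict`, the reduced state layout `[cx, cy, vx, vy]`, the process-noise value, and the quarter-circle trajectory are all assumptions chosen for illustration.

```python
import numpy as np


def constant_velocity_predict(state, cov, dt=1.0, process_noise=1e-2):
    """One Kalman predict step under a constant-velocity motion model.

    state: [cx, cy, vx, vy] (hypothetical, simplified box-center state)
    cov:   4x4 state covariance
    """
    # Linear transition: position advances by velocity * dt, velocity is unchanged.
    F = np.array(
        [
            [1.0, 0.0, dt, 0.0],
            [0.0, 1.0, 0.0, dt],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0],
        ]
    )
    # Isotropic Gaussian process noise (an assumed, illustrative value).
    Q = process_noise * np.eye(4)
    return F @ state, F @ cov @ F.T + Q


# Linear motion: the prediction simply extrapolates the current velocity.
linear_state, linear_cov = constant_velocity_predict(
    np.array([0.0, 0.0, 1.0, 2.0]), np.eye(4)
)

# Non-linear motion: a target moving along a quarter circle of radius 10.
angles = np.linspace(0.0, np.pi / 2, 4)
centers = 10.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)

# Estimate velocity from the two most recent observed centers,
# then predict the held-out final center.
velocity = centers[2] - centers[1]
turning_state = np.concatenate([centers[2], velocity])
predicted, _ = constant_velocity_predict(turning_state, np.eye(4))

# Distance between the linear extrapolation and the true next center.
error = np.linalg.norm(predicted[:2] - centers[3])
print(f"prediction error on the turning target: {error:.2f}")
```

On the straight-line state the prediction matches the true motion exactly, while on the turning target the linear extrapolation overshoots the curve. This is the kind of gap that a learned predictor conditioned on a longer history of bounding boxes is meant to close.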