Multiple object tracking (MOT) has been studied with considerable success in computer vision. However, MOT in videos captured by unmanned aerial vehicles (UAVs) remains challenging because of small object size, blurred object appearance, and very large and/or irregular motion of both the ground objects and the UAV platform. In this paper, we propose FOLT to mitigate these problems and achieve fast and accurate MOT in UAV view. To balance speed and accuracy, FOLT adopts a modern detector and a lightweight optical flow extractor to extract object detection features and motion features at minimal cost. Given the extracted flow, flow-guided feature augmentation is designed to augment the object detection features based on their optical flow, which improves the detection of small objects. Flow-guided motion prediction is then proposed to predict each object's position in the next frame, which improves the tracking of objects with very large displacements between adjacent frames. Finally, the tracker matches the detected objects and the predicted objects using a spatial matching scheme to generate a track for every object. Experiments on the VisDrone and UAVDT datasets show that the proposed model successfully tracks small objects with large and irregular motion and outperforms existing state-of-the-art methods on UAV-MOT tasks.
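The last two steps, motion prediction from optical flow followed by spatial matching, can be illustrated with a minimal sketch. It is not the FOLT implementation: the abstract does not specify how flow is aggregated or how matching is scored, so the sketch assumes that a box is shifted by the mean flow inside it and that predicted and detected boxes are paired greedily by IoU. The function names `predict_box`, `iou`, and `match`, and the `iou_threshold` parameter, are hypothetical.

```python
import numpy as np


def predict_box(box, flow):
    """Shift a box (x1, y1, x2, y2) by the mean optical flow inside it.

    Assumption: `flow` has shape (H, W, 2) and stores the per-pixel
    displacement (dx, dy) from the current frame to the next frame.
    """
    h, w = flow.shape[:2]
    x1, y1, x2, y2 = box
    # Clamp the integer crop to the image and keep at least one pixel.
    c1 = int(np.clip(np.floor(x1), 0, w - 1))
    r1 = int(np.clip(np.floor(y1), 0, h - 1))
    c2 = int(np.clip(np.ceil(x2), c1 + 1, w))
    r2 = int(np.clip(np.ceil(y2), r1 + 1, h))
    dx, dy = flow[r1:r2, c1:c2].reshape(-1, 2).mean(axis=0)
    return np.array([x1 + dx, y1 + dy, x2 + dx, y2 + dy], dtype=float)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match(predicted, detections, iou_threshold=0.3):
    """Greedy one-to-one matching of predicted boxes to detections.

    Returns (track_index, detection_index) pairs, highest IoU first.
    Assumption: greedy IoU matching stands in for the paper's spatial
    matching scheme, whose details the abstract does not give.
    """
    pairs = [
        (iou(p, d), i, j)
        for i, p in enumerate(predicted)
        for j, d in enumerate(detections)
    ]
    pairs.sort(reverse=True)
    used_tracks, used_dets, matches = set(), set(), []
    for score, i, j in pairs:
        if score < iou_threshold:
            break  # All remaining pairs overlap too little.
        if i in used_tracks or j in used_dets:
            continue
        used_tracks.add(i)
        used_dets.add(j)
        matches.append((i, j))
    return matches
```

As a usage example, take a 10-pixel-wide box at `[10, 10, 20, 20]` and a flow field that is uniformly 20 pixels to the right. The object's next position no longer overlaps its current box, so matching on the unshifted box would fail. `predict_box` moves the box to `[30, 10, 40, 20]`, which `match` can then pair with a detection near that location. This is the large-displacement case the abstract describes. An optimal assignment solver could replace the greedy loop without changing the interface.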