Multiple object tracking (MOT) has been studied with considerable success in computer vision. However, MOT in videos captured by unmanned aerial vehicles (UAVs) remains challenging because of small object sizes, blurred object appearance, and very large and/or irregular motion of both the ground objects and the UAV platform. In this paper, we propose FOLT to mitigate these problems and achieve fast and accurate MOT in UAV views. To balance speed and accuracy, FOLT adopts a modern detector and a lightweight optical flow extractor to extract object detection features and motion features at minimal cost. Given the extracted flow, flow-guided feature augmentation is designed to augment the object detection features based on their optical flow, which improves the detection of small objects. Flow-guided motion prediction is then proposed to predict each object's position in the next frame, which improves the tracking of objects with very large displacements between adjacent frames. Finally, the tracker matches the detected objects with the predicted objects using a spatial matching scheme to generate a track for every object. Experiments on the VisDrone and UAVDT datasets show that the proposed model successfully tracks small objects with large and irregular motion and outperforms existing state-of-the-art methods on UAV-MOT tasks.
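The last two steps of the pipeline, predicting each object's next position from optical flow and then matching predictions to detections spatially, can be illustrated with a minimal sketch. The abstract does not specify how FOLT implements either step, so everything here is an assumption made for illustration: the helper names `predict_box`, `iou`, and `match` are hypothetical, the prediction simply shifts a box by the mean flow inside it, and the matching is a greedy IoU assignment with an arbitrary threshold.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
Flow = List[List[Tuple[float, float]]]    # flow[y][x] = (dx, dy) per pixel


def predict_box(box: Box, flow: Flow) -> Box:
    """Shift a box by the mean optical flow inside it (assumed predictor)."""
    h, w = len(flow), len(flow[0])
    x1, y1, x2, y2 = box
    # Clamp the pixel range to the image and cover at least one pixel.
    xs = range(max(0, int(x1)), min(w, max(int(x1) + 1, int(x2))))
    ys = range(max(0, int(y1)), min(h, max(int(y1) + 1, int(y2))))
    vectors = [flow[y][x] for y in ys for x in xs]
    if not vectors:
        return box  # box lies outside the flow field; leave it unchanged
    dx = sum(v[0] for v in vectors) / len(vectors)
    dy = sum(v[1] for v in vectors) / len(vectors)
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match(predicted: List[Box], detections: List[Box],
          min_iou: float = 0.3) -> List[Tuple[int, int]]:
    """Greedy spatial matching: pair boxes in order of descending IoU.

    Returns (track_index, detection_index) pairs. The threshold 0.3 is an
    arbitrary illustrative value, not one taken from the paper.
    """
    pairs = sorted(
        ((iou(p, d), i, j)
         for i, p in enumerate(predicted)
         for j, d in enumerate(detections)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, i, j in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap even less
        if i in used_tracks or j in used_dets:
            continue
        used_tracks.add(i)
        used_dets.add(j)
        matches.append((i, j))
    return matches
```

In this sketch, an object that moves farther than its own width between frames has zero overlap with its previous box, so matching on the previous position would fail. Shifting the box by the flow first restores the overlap, which is the intuition behind using motion prediction for large displacements.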