Multi-Object Tracking (MOT) is a fundamental computer vision task that underpins many video analysis applications. Despite promising recent progress, current MOT research still assumes that the input stream is sampled at a fixed frame rate. In fact, we found empirically that the accuracy of all recent state-of-the-art trackers drops dramatically when the input frame rate changes. Toward a more intelligent tracking solution, we turn our attention to the problem of Frame Rate Agnostic MOT (FraMOT), which takes frame rate insensitivity into consideration. In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic Training Scheme (FAPS), the first to tackle the FraMOT problem. Specifically, we propose a Frame Rate Agnostic Association Module (FAAM) that infers and encodes frame rate information to aid identity matching across multi-frame-rate inputs, improving the learned model's ability to handle the complex motion-appearance relations in FraMOT. Moreover, the association gap between training and inference is enlarged in FraMOT, because the post-processing steps that are not included in training make a larger difference in lower frame rate scenarios. To address this, we propose a Periodic Training Scheme (PTS) that reflects all post-processing steps in training via tracking pattern matching and fusion. Along with the proposed approaches, we make the first attempt to establish an evaluation method for the new FraMOT task in two modes, known frame rate and unknown frame rate, so as to cover more complex situations. Quantitative experiments on the challenging MOT17/20 datasets (FraMOT version) demonstrate that the proposed approaches handle different frame rates better and thus improve robustness in complicated scenarios.
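To make the frame rate sensitivity concrete, the following is a minimal sketch of how a lower frame rate could be emulated by temporally subsampling an annotated sequence. The abstract does not describe how the FraMOT version of MOT17/20 is built, so the subsampling strategy, the simplified row layout, and the helper names `subsample_sequence` and `mean_step` are all assumptions made for illustration.

```python
from typing import List, Tuple

# (frame, track_id, x, y, w, h): a simplified, hypothetical MOT-style row.
Row = Tuple[int, int, float, float, float, float]


def subsample_sequence(rows: List[Row], stride: int) -> List[Row]:
    """Keep every `stride`-th frame (starting at frame 1) and renumber from 1."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    kept = []
    for frame, track_id, x, y, w, h in rows:
        if (frame - 1) % stride == 0:
            new_frame = (frame - 1) // stride + 1
            kept.append((new_frame, track_id, x, y, w, h))
    return kept


def mean_step(rows: List[Row]) -> float:
    """Mean horizontal displacement of each track between consecutive kept frames."""
    last_x = {}
    steps = []
    for frame, track_id, x, *_ in sorted(rows):
        if track_id in last_x:
            steps.append(abs(x - last_x[track_id]))
        last_x[track_id] = x
    return sum(steps) / len(steps) if steps else 0.0


# One synthetic object moving 2 px per frame over 9 frames (assumed data).
rows = [(f, 1, 2.0 * f, 0.0, 10.0, 20.0) for f in range(1, 10)]
full = subsample_sequence(rows, 1)  # original frame rate
low = subsample_sequence(rows, 4)   # one quarter of the original frame rate

print(mean_step(full), mean_step(low))
```

In this toy sequence the per-frame displacement grows in proportion to the stride, which hints at why an association model tuned for one frame rate may struggle at another: the same object motion appears as a much larger jump between adjacent frames.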