Multi-object tracking in traffic videos is a crucial research area, with considerable potential to improve traffic monitoring accuracy and support road safety measures through advanced machine learning algorithms. However, existing datasets for multi-object tracking in traffic videos often contain few instances or focus on a single class, and so fail to capture the challenges encountered in complex traffic scenarios. To address this gap, we introduce TrafficMOT, an extensive dataset designed to cover diverse traffic situations with complex scenarios. To validate the complexity and challenges presented by TrafficMOT, we conducted comprehensive empirical studies in three settings: fully supervised, semi-supervised, and zero-shot, the last using the recent foundation model Tracking Anything Model (TAM). The experimental results highlight the inherent complexity of this dataset, emphasising its value in driving advances in traffic monitoring and multi-object tracking.