Multiple toddler tracking (MTT) involves identifying and differentiating toddlers in video footage. While conventional multi-object tracking (MOT) algorithms are adept at tracking diverse objects, toddlers pose unique challenges because of their unpredictable movements, varied poses, and similar appearance. Tracking toddlers in indoor environments introduces additional complexities such as occlusions and limited fields of view. In this paper, we address the challenges of MTT and propose MTTSort, a customized method built upon the DeepSort algorithm and designed to accurately track multiple toddlers in indoor videos. Our contributions are fourfold: we discuss the primary challenges in MTT, introduce a genetic algorithm to optimize hyperparameters, propose an accurate tracking algorithm, and curate the MTTrack dataset using unbiased AI co-labeling techniques. We quantitatively compare MTTSort to state-of-the-art MOT methods on the MTTrack, DanceTrack, and MOT15 datasets. In our evaluation, the proposed method outperformed other MOT methods, achieving 0.98, 0.68, and 0.98 in multiple object tracking accuracy (MOTA), higher order tracking accuracy (HOTA), and identification F1 score (IDF1), respectively.
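For readers unfamiliar with the reported metrics, the following minimal sketch shows how MOTA and IDF1 are conventionally computed from aggregate error counts. It uses the standard definitions, MOTA = 1 - (FN + FP + IDSW) / GT and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). The function names and the example counts are illustrative assumptions, not the evaluation code or data used for MTTSort. HOTA is omitted because it requires per-threshold association statistics that do not reduce to a one-line formula.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple object tracking accuracy from counts summed over all frames.

    num_gt is the total number of ground-truth boxes across the sequence.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identification F1 score from identity-level TP, FP, and FN counts."""
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        return 0.0  # no detections and no ground truth: score is undefined, report 0
    return 2 * idtp / denom


# Hypothetical counts chosen only for illustration (not taken from the paper).
example_mota = mota(false_negatives=10, false_positives=5, id_switches=5, num_gt=1000)
example_idf1 = idf1(idtp=98, idfp=2, idfn=2)
print(round(example_mota, 2), round(example_idf1, 2))
```

Note that MOTA can be negative when the total number of errors exceeds the number of ground-truth boxes, whereas IDF1 always lies in [0, 1].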