Detecting and tracking multiple unmanned aerial vehicles (UAVs) in thermal infrared video is inherently challenging due to low contrast, environmental noise, and small target sizes. This paper presents a straightforward approach to multi-UAV tracking in thermal infrared video that leverages recent advances in detection and tracking. Instead of relying on the well-established combination of YOLOv5 and DeepSORT, we propose a tracking framework built on YOLOv12 and BoT-SORT, enhanced with tailored training and inference strategies. We evaluate our approach using the metrics of the 4th Anti-UAV Challenge and achieve competitive performance. Notably, we obtain strong results without using contrast enhancement or temporal information fusion to enrich UAV features, which positions our approach as a "Strong Baseline" for multi-UAV tracking tasks. We provide implementation details, in-depth experimental analysis, and a discussion of potential improvements. The code is available at https://github.com/wish44165/YOLOv12-BoT-SORT-ReID.