Detecting and tracking multiple unmanned aerial vehicles (UAVs) in thermal infrared video is inherently challenging due to low contrast, environmental noise, and small target sizes. This paper presents a straightforward approach to multi-UAV tracking in thermal infrared video that leverages recent advances in detection and tracking. Instead of relying on the YOLOv5 and DeepSORT pipeline, we present a tracking framework built on YOLOv12 and BoT-SORT, enhanced with tailored training and inference strategies. We evaluate our approach using the metrics of the 4th Anti-UAV Challenge and demonstrate competitive performance. Notably, we achieve strong results without using contrast enhancement or temporal information fusion to enrich UAV features, which positions our approach as a "Strong Baseline" for the multi-UAV tracking task. We provide implementation details, in-depth experimental analysis, and a discussion of potential improvements. The code is available at https://github.com/wish44165/YOLOv12-BoT-SORT-ReID.