This work examines the YOLOv6 object detection model in depth, focusing on its architecture, optimization techniques, and detection performance. The core components of YOLOv6 are the EfficientRep backbone for feature extraction and the Rep-PAN neck for feature aggregation. Evaluated on the COCO dataset, YOLOv6-N achieves 37.5\% AP at 1187 FPS on an NVIDIA Tesla T4 GPU. YOLOv6-S reaches 45.0\% AP at 484 FPS, outperforming detectors of the same scale such as PPYOLOE-S, YOLOv5-S, YOLOX-S, and YOLOv8-S. YOLOv6-M and YOLOv6-L reach 50.0\% and 52.8\% AP, respectively, which is higher than other detectors running at comparable inference speeds. With an upgraded backbone and neck structure, YOLOv6-L6 achieves state-of-the-art accuracy in real time.
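To put the reported speeds in more familiar units, the FPS figures can be converted into an average time per image. The following is a minimal sketch, assuming the quoted FPS values are throughput numbers; the helper name `ms_per_image` and the `reported` dictionary are illustrative and not part of YOLOv6. The result is an average implied by throughput and may differ from measured single-image latency.

```python
def ms_per_image(fps: float) -> float:
    """Average time per image in milliseconds implied by a throughput in FPS."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Figures quoted in the abstract: (AP in percent, FPS on an NVIDIA Tesla T4).
reported = {
    "YOLOv6-N": (37.5, 1187.0),
    "YOLOv6-S": (45.0, 484.0),
}

for name, (ap, fps) in reported.items():
    # Assumption: FPS is treated as throughput, so this is an average per image.
    print(f"{name}: {ap:.1f}% AP at {fps:.0f} FPS, about {ms_per_image(fps):.2f} ms per image")
```

Under this assumption, YOLOv6-N works out to a little under 1 ms per image and YOLOv6-S to roughly 2 ms per image.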