This paper presents DEYOv2, a novel object detector and an improved version of the first-generation DEYO (DETR with YOLO) model. Like its predecessor, DEYOv2 employs a progressive reasoning approach to accelerate model training and enhance performance. The study examines the limitations that one-to-one matching imposes on optimization and proposes solutions that address them, such as Rank Feature and Greedy Matching. These allow the third stage of DEYOv2 to draw as much information as possible from the first and second stages without needing NMS, achieving end-to-end optimization. By combining dense queries, sparse queries, one-to-many matching, and one-to-one matching, DEYOv2 leverages the advantages of each method. It outperforms all existing query-based end-to-end detectors under the same settings. With ResNet-50 as the backbone and multi-scale features on the COCO dataset, DEYOv2 achieves 51.1 AP and 51.8 AP in 12 and 24 epochs, respectively. Compared to the end-to-end model DINO, DEYOv2 provides significant performance gains of 2.1 AP and 1.4 AP in the two epoch settings. To the best of our knowledge, DEYOv2 is the first fully end-to-end object detector that combines the respective strengths of classical detectors and query-based detectors.
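To make the distinction between one-to-many and one-to-one matching concrete, the following is a minimal sketch on a toy IoU matrix. It is not DEYOv2's Rank Feature or Greedy Matching, which the abstract does not specify. The function names, the 0.5 threshold, the IoU values, and the brute-force search are all assumptions chosen for illustration.

```python
from itertools import permutations

# Hypothetical IoU matrix: rows are predictions, columns are ground-truth boxes.
IOU = [
    [0.9, 0.1],
    [0.8, 0.0],
    [0.2, 0.7],
    [0.1, 0.6],
]


def one_to_many(iou, threshold=0.5):
    """Assign to each ground-truth box every prediction whose IoU meets the threshold.

    Several predictions can share one target, which gives dense supervision but
    leaves duplicates that a classical detector removes with NMS.
    """
    n_gt = len(iou[0])
    return {
        g: [p for p, row in enumerate(iou) if row[g] >= threshold]
        for g in range(n_gt)
    }


def one_to_one(iou):
    """Assign exactly one distinct prediction to each ground-truth box.

    Brute force over all assignments, maximizing total IoU. This is only
    practical for toy sizes and assumes at least as many predictions as targets.
    """
    n_pred, n_gt = len(iou), len(iou[0])
    best, best_score = None, -1.0
    for preds in permutations(range(n_pred), n_gt):
        score = sum(iou[p][g] for g, p in enumerate(preds))
        if score > best_score:
            best, best_score = preds, score
    return {g: p for g, p in enumerate(best)}


if __name__ == "__main__":
    print(one_to_many(IOU))
    print(one_to_one(IOU))
```

In this toy case each ground-truth box receives two positive predictions under one-to-many matching but only one under one-to-one matching, which is why the latter yields sparser supervision while removing the need for duplicate suppression.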