In real-world applications where confidence is key, such as autonomous driving, accurately detecting and appropriately handling classes that differ from those used during training is crucial. Despite the proposal of various unknown object detection approaches, we observe widespread inconsistencies among them in the datasets, metrics, and scenarios used, alongside a notable absence of a clear definition of unknown objects, which hampers meaningful evaluation. To address these issues, we introduce two benchmarks: a unified VOC-COCO evaluation, and the new OpenImagesRoad benchmark, which provides a clear hierarchical object definition along with new evaluation metrics. Complementing the benchmarks, we exploit the performance of recent self-supervised Vision Transformers to improve pseudo-labeling-based Open-Set Object Detection (OSOD) through OW-DETR++. State-of-the-art methods are extensively evaluated on the proposed benchmarks. This study provides a clear problem definition, ensures consistent evaluations, and draws new conclusions about the effectiveness of OSOD strategies.