Open World Object Detection (OWOD) is a challenging and realistic task that extends beyond the scope of the standard object detection task. It involves detecting both known and unknown objects while integrating learned knowledge for future tasks. However, the level of "unknownness" varies significantly depending on the context. For example, a tree is typically considered part of the background in a self-driving scene, but it may be significant in a household context. We argue that this contextual information should already be embedded within the known classes. In other words, there should be a semantic or latent structural relationship between the known items and the unknown items to be discovered. Motivated by this observation, we propose Hyp-OW, a method that learns and models a hierarchical representation of known items through a SuperClass Regularizer. Leveraging this representation allows us to effectively detect unknown objects using a similarity distance-based relabeling module. Extensive experiments on benchmark datasets demonstrate the effectiveness of Hyp-OW, which improves both known and unknown detection (by up to 6 percent). These gains are particularly pronounced on our newly designed benchmark, where a strong hierarchical structure exists between known and unknown objects. Our code can be found at https://github.com/boschresearch/Hyp-OW
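To make the idea of a similarity distance-based relabeling step concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that each known class is summarized by a single embedding centroid, that distances are measured in a Poincaré ball (an assumption suggested only by the method's name; the abstract does not specify the metric), and that a fixed threshold separates "unknown" from "background". The names `poincare_distance`, `relabel_unmatched`, `known_centroids`, and `threshold` are all hypothetical.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def relabel_unmatched(proposals, known_centroids, threshold, distance=poincare_distance):
    """Relabel proposals that matched no known object.

    A proposal whose embedding lies within `threshold` of its nearest
    known-class centroid is labeled "unknown"; otherwise it is "background".
    The single-centroid-per-class summary and the fixed threshold are
    illustrative assumptions.
    """
    labels = []
    for embedding in proposals:
        nearest = min(distance(embedding, c) for c in known_centroids.values())
        labels.append("unknown" if nearest <= threshold else "background")
    return labels


# Toy 2-D embeddings; real detectors would use learned, higher-dimensional features.
known_centroids = {"car": (0.5, 0.0), "truck": (0.6, 0.1)}
proposals = [(0.55, 0.05), (-0.8, -0.3)]
labels = relabel_unmatched(proposals, known_centroids, threshold=1.0)
print(labels)
```

In this toy setting, the first proposal sits close to the "car" and "truck" centroids and is therefore treated as a plausible unknown object from the same context, while the second lies far from every known class and is left as background. The `distance` argument is pluggable so that a different similarity measure can be substituted.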