Open-world instance segmentation has recently gained significant popularity due to its importance in many real-world applications, such as autonomous driving, robot perception, and remote sensing. However, previous methods have either produced unsatisfactory results or relied on complex systems and paradigms. We ask whether there is a simple way to obtain state-of-the-art results. We identify two observations that let us achieve the best of both worlds: 1) query-based methods demonstrate superiority over dense proposal-based methods in open-world instance segmentation, and 2) learning localization cues is sufficient for open-world instance segmentation. Based on these observations, we propose a simple query-based method named OpenInst for open-world instance segmentation. OpenInst builds on advanced query-based methods such as QueryInst and focuses on learning localization cues. Notably, OpenInst is an extremely simple and straightforward framework without any auxiliary modules or post-processing, yet it achieves state-of-the-art results on multiple benchmarks. Specifically, in the COCO$\to$UVO scenario, OpenInst achieves a mask AR of 53.3, outperforming the previous best methods by 2.0 AR with a simpler structure. We hope that OpenInst can serve as a solid baseline for future research in this area.