We propose a simple three-stage approach for segmenting unseen objects in RGB images using their CAD models. Leveraging two recent foundation models, DINOv2 and Segment Anything, we create descriptors and generate proposals, including binary masks, for a given input RGB image. By matching these proposals against reference descriptors created from the CAD models, we obtain precise object ID assignments along with modal masks. We experimentally demonstrate that our method achieves state-of-the-art results in CAD-based novel object segmentation, surpassing existing approaches on the seven core datasets of the BOP challenge by 19.8% AP under the same BOP evaluation protocol. Our source code is available at https://github.com/nv-nguyen/cnos.
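The matching stage described in the abstract can be illustrated with a minimal sketch. It is not the released implementation: the function names (`l2_normalize`, `match_proposals`), the use of cosine similarity, the mean aggregation over rendered views, and the `min_score` threshold are all assumptions made for illustration, and the toy vectors stand in for descriptors that would come from DINOv2 applied to Segment Anything proposals and to CAD renderings.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps guards against zero vectors)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def match_proposals(proposal_desc, reference_desc, min_score=0.5):
    """Assign an object ID to each proposal by descriptor similarity.

    proposal_desc:  (N, D) array, one descriptor per mask proposal.
    reference_desc: (K, V, D) array, V rendered views for each of K CAD models.
    min_score:      hypothetical threshold for discarding weak matches.

    Returns (object_ids, scores, keep), each of shape (N,).
    """
    p = l2_normalize(proposal_desc)
    r = l2_normalize(reference_desc)

    # Cosine similarity between every proposal and every view: shape (N, K, V).
    sim = np.einsum("nd,kvd->nkv", p, r)

    # Assumption: aggregate over views with a plain mean. Other choices
    # (max, mean of top-k) are equally plausible for this sketch.
    per_object = sim.mean(axis=2)  # shape (N, K)

    object_ids = per_object.argmax(axis=1)
    scores = per_object.max(axis=1)
    keep = scores >= min_score
    return object_ids, scores, keep


if __name__ == "__main__":
    # Toy data: two objects, two views each, 4-dimensional descriptors.
    e = np.eye(4)
    references = np.stack([
        np.stack([e[0], e[0]]),  # object 0
        np.stack([e[1], e[1]]),  # object 1
    ])
    proposals = np.array([
        [2.0, 0.0, 0.0, 0.0],  # aligned with object 0
        [0.0, 3.0, 0.0, 0.0],  # aligned with object 1
        [0.0, 0.0, 1.0, 0.0],  # matches neither object
    ])
    ids, scores, keep = match_proposals(proposals, references)
    print(ids, scores, keep)
```

In this sketch, each kept proposal carries its mask together with the assigned object ID and score, while proposals whose best score falls below the threshold are treated as background.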