We propose a simple three-stage approach for segmenting unseen objects in RGB images using their CAD models. Leveraging two recent foundation models, DINOv2 and Segment Anything, we create descriptors and generate proposals, including binary masks, for a given input RGB image. By matching these proposals against reference descriptors created from the CAD models, we assign object IDs precisely and obtain modal masks. We show experimentally that our method achieves state-of-the-art results in CAD-based novel object segmentation, surpassing existing approaches on the seven core datasets of the BOP challenge by 19.8% AP under the same BOP evaluation protocol. Our source code is available at https://github.com/nv-nguyen/cnos.
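The matching stage described above can be illustrated with a minimal sketch. It is not the released implementation: the abstract does not state the similarity metric, the aggregation over reference views, or any score threshold, so cosine similarity, mean aggregation, and the `min_score` value below are assumptions, as are the function names. The descriptors are small synthetic arrays standing in for DINOv2 features of Segment Anything proposals and of rendered CAD views.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def match_proposals(proposal_desc, reference_desc, min_score=0.5):
    """Assign an object ID to each proposal (hypothetical helper).

    proposal_desc:  (N, D) array, one descriptor per mask proposal.
    reference_desc: (O, T, D) array, T reference descriptors per object.
    Returns (object_ids, scores, keep), each of length N.
    """
    p = l2_normalize(proposal_desc)
    r = l2_normalize(reference_desc)
    # Cosine similarity of every proposal with every reference view: (N, O, T).
    sim = np.einsum("nd,otd->not", p, r)
    # Assumption: average over the reference views of each object.
    per_object = sim.mean(axis=2)
    object_ids = per_object.argmax(axis=1)
    scores = per_object.max(axis=1)
    # Assumption: discard proposals that match no object well enough.
    keep = scores >= min_score
    return object_ids, scores, keep


# Synthetic stand-in data: 3 objects, 4 reference views each, 8-dim descriptors.
rng = np.random.default_rng(0)
base = np.eye(3, 8)
reference_desc = base[:, None, :] + 0.05 * rng.normal(size=(3, 4, 8))

# Two proposals that resemble objects 2 and 0, plus one background-like proposal.
background = np.zeros(8)
background[7] = 1.0
proposal_desc = np.stack([reference_desc[2, 0], reference_desc[0, 1], background])

object_ids, scores, keep = match_proposals(proposal_desc, reference_desc)
print(object_ids, np.round(scores, 3), keep)
```

In a real pipeline each kept proposal would carry its binary mask forward together with the assigned object ID and score; other aggregations over reference views (for example the maximum) would slot in at the `mean` call.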