Zero-shot instance segmentation aims to detect and precisely segment objects of unseen categories without any training samples of those categories. Because the model is trained only on seen categories, it is strongly biased toward classifying all objects as seen categories. In addition, novel objects that never appear in training are naturally confused with background. These two challenges make it hard for novel objects to surface in the final instance segmentation results, so novel objects need to be rescued from both the background and the dominant seen categories. To this end, we propose D$^2$Zero, which combines Semantic-Promoted Debiasing and Background Disambiguation to improve zero-shot instance segmentation. Semantic-promoted debiasing uses inter-class semantic relationships to involve unseen categories in visual feature training, and learns an input-conditional classifier that performs dynamic classification based on the input image. Background disambiguation produces an image-adaptive background representation to avoid mistaking novel objects for background. Extensive experiments show that D$^2$Zero outperforms previous state-of-the-art methods by a large margin, e.g., a 16.86% improvement on COCO. Project page: https://henghuiding.github.io/D2Zero/
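The two ideas named in the abstract, an input-conditional classifier and an image-adaptive background representation, can be illustrated with a minimal sketch. This is not the D$^2$Zero implementation: the abstract does not specify the mechanisms, so the sketch assumes a single-head cross-attention step to condition class embeddings on the image and masked average pooling to build the background prototype. All function names, the temperature value, and the assumption that semantic and visual features share one dimension are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + 1e-8)

def background_prototype(pixel_feats, fg_prob):
    # Assumption: masked average pooling, weighting each pixel feature
    # by its (1 - foreground probability).
    weights = [1.0 - p for p in fg_prob]
    total = sum(weights) + 1e-8
    dim = len(pixel_feats[0])
    return [sum(w * f[d] for w, f in zip(weights, pixel_feats)) / total
            for d in range(dim)]

def condition_on_image(class_emb, pixel_feats):
    # Assumption: the class embedding acts as a query that attends over
    # pixel features; the attended context is added back as a residual.
    scale = math.sqrt(len(class_emb))
    attn = softmax([dot(class_emb, f) / scale for f in pixel_feats])
    dim = len(class_emb)
    context = [sum(a * f[d] for a, f in zip(attn, pixel_feats))
               for d in range(dim)]
    return [e + c for e, c in zip(class_emb, context)]

def classify(proposal, class_embs, pixel_feats, fg_prob, temperature=0.1):
    # Build image-conditioned class prototypes (seen and unseen alike),
    # append the image-adaptive background prototype, then score the
    # proposal embedding against all of them.
    names = list(class_embs)
    prototypes = [condition_on_image(class_embs[n], pixel_feats) for n in names]
    names.append("background")
    prototypes.append(background_prototype(pixel_feats, fg_prob))
    logits = [cosine(proposal, p) / temperature for p in prototypes]
    return dict(zip(names, softmax(logits)))

if __name__ == "__main__":
    # Toy 2-D features for four pixels; the first two are likely foreground.
    pixel_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    fg_prob = [0.9, 0.8, 0.1, 0.05]
    class_embs = {"seen_cat": [1.0, 0.2], "unseen_dog": [0.8, -0.3]}
    print(classify([1.0, 0.0], class_embs, pixel_feats, fg_prob))
```

In this sketch both the class prototypes and the background prototype are recomputed for every image, which is the property the abstract attributes to the input-conditional classifier and the image-adaptive background representation. A fixed, learned background vector would be the non-adaptive alternative.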