Recently, inspired by DETR variants, query-based end-to-end instance segmentation (QEIS) methods have outperformed CNN-based models on large-scale datasets. However, they lose efficacy when only a small amount of training data is available, because it is hard for the crucial queries/kernels to learn localization and shape priors. To address this, we propose a novel unsupervised pre-training solution for low-data regimes. Inspired by the recent success of prompting techniques, we introduce a new pre-training method that boosts QEIS models by providing Saliency Prompts for queries/kernels. Our method contains three parts: 1) Saliency Masks Proposal generates pseudo masks from unlabeled images based on the saliency mechanism. 2) Prompt-Kernel Matching converts pseudo masks into prompts and injects the corresponding localization and shape priors into the best-matched kernels. 3) Kernel Supervision supplies supervision at the kernel level for robust learning. From a practical perspective, our pre-training method helps QEIS models achieve convergence speed and performance comparable to those of CNN-based models in low-data regimes. Experimental results show that our method significantly boosts several QEIS models on three datasets. Code will be made available.
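The Prompt-Kernel Matching step can be illustrated with a minimal sketch. The abstract does not specify how prompts are built, how similarity is measured, or how priors are injected, so everything below is an assumption made for illustration: masked average pooling to turn a pseudo mask into a prompt vector, cosine similarity to pick the best-matched kernel, and element-wise addition as the injection. The function names `masked_average`, `match_prompts_to_kernels`, and `inject` are hypothetical and are not taken from the authors' code.

```python
import math


def masked_average(features, mask):
    """Pool an H x W x C feature map over the pixels where mask is 1.

    Assumption: a prompt is the mean feature inside the pseudo mask.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for feat, inside in zip(feat_row, mask_row):
            if inside:
                count += 1
                for c in range(channels):
                    total[c] += feat[c]
    if count == 0:
        return total  # empty mask: return the zero vector
    return [t / count for t in total]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_prompts_to_kernels(prompts, kernels):
    """For each prompt, return the index of its most similar kernel."""
    return [
        max(range(len(kernels)), key=lambda k: cosine(prompt, kernels[k]))
        for prompt in prompts
    ]


def inject(prompts, kernels, matches):
    """Add each prompt to its matched kernel (assumed injection rule).

    Kernels that no prompt selected are returned unchanged.
    """
    updated = [list(kernel) for kernel in kernels]
    for prompt, k in zip(prompts, matches):
        updated[k] = [kv + pv for kv, pv in zip(updated[k], prompt)]
    return updated


# Toy 2 x 2 feature map with 2 channels, and two pseudo masks.
features = [[[1.0, 0.0], [1.0, 0.0]],
            [[0.0, 1.0], [0.0, 1.0]]]
mask_top = [[1, 1], [0, 0]]
mask_bottom = [[0, 0], [1, 1]]

prompts = [masked_average(features, mask_top),
           masked_average(features, mask_bottom)]
kernels = [[0.1, 0.9], [0.8, 0.2], [-1.0, -1.0]]

matches = match_prompts_to_kernels(prompts, kernels)
print(matches)  # [1, 0]
print(inject(prompts, kernels, matches))
```

In this toy case the top-region prompt selects the second kernel and the bottom-region prompt selects the first, while the third kernel receives no prior. A one-to-one assignment (for example, Hungarian matching) would be a natural alternative to the per-prompt argmax used here; which of these the method actually uses is not stated in the abstract.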