Self-supervised pre-training, based on the pretext task of instance discrimination, has fueled recent advances in label-efficient object detection. However, existing studies pre-train only the feature extractor network to learn transferable representations for downstream detection tasks, so multiple detection-specific modules must be trained from scratch in the fine-tuning phase. We argue that the region proposal network (RPN), a common detection-specific module, can additionally be pre-trained to reduce the localization error of multi-stage detectors. In this work, we propose a simple pretext task that provides effective pre-training for the RPN and efficiently improves downstream object detection performance. We evaluate our approach on benchmark object detection tasks and on additional downstream tasks, including instance segmentation and few-shot detection. Compared with multi-stage detectors without RPN pre-training, our approach consistently improves downstream task performance, with the largest gains in label-scarce settings.