Mind the Gap: Polishing Pseudo labels for Accurate Semi-supervised Object Detection

Exploiting pseudo labels (e.g., categories and bounding boxes) of unannotated objects produced by a teacher detector has underpinned much of the recent progress in semi-supervised object detection (SSOD). However, because scarce annotations limit the generalization capacity of the teacher detector, the produced pseudo labels often deviate from the ground truth, especially those with relatively low classification confidence, which in turn limits the generalization performance of SSOD. To mitigate this problem, we propose a dual pseudo-label polishing framework for SSOD. Instead of directly exploiting the pseudo labels produced by the teacher detector, we make the first attempt to reduce their deviation from the ground truth using dual polishing learning: two differently structured polishing networks, one for categories and one for bounding boxes, are trained on the given annotated objects using synthesized pairs of pseudo labels and the corresponding ground truth. In this way, both polishing networks can infer more accurate pseudo labels for unannotated objects by fully exploiting their context knowledge based on the initially produced pseudo labels, and thus improve the generalization performance of SSOD. Moreover, the scheme can be seamlessly plugged into existing SSOD frameworks for joint end-to-end learning. In addition, we propose to disentangle the polished pseudo categories and bounding boxes of unannotated objects for separate category classification and bounding box regression in SSOD, which allows more unannotated objects to be introduced during model training and thus further improves performance. Experiments on both the PASCAL VOC and MS COCO benchmarks demonstrate the superiority of the proposed method over existing state-of-the-art baselines.
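To make the idea of synthesized paired pseudo labels concrete, the following is a minimal sketch of one plausible way to build (noisy box, ground-truth box) training pairs from annotated objects. The abstract does not specify the synthesis procedure, so the uniform corner jitter, the function names `jitter_box`, `iou`, and `synthesize_pairs`, and the parameters `copies` and `scale` are all illustrative assumptions rather than the method itself.

```python
import random


def jitter_box(box, scale=0.1, rng=random):
    """Return a noisy copy of an (x1, y1, x2, y2) box.

    Assumption: each coordinate is shifted by up to `scale` times the box
    width or height, imitating an imprecise teacher prediction.
    Keep scale < 0.5 so the jittered box stays valid (x1 < x2, y1 < y2).
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (
        x1 + rng.uniform(-scale, scale) * w,
        y1 + rng.uniform(-scale, scale) * h,
        x2 + rng.uniform(-scale, scale) * w,
        y2 + rng.uniform(-scale, scale) * h,
    )


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def synthesize_pairs(gt_boxes, copies=4, scale=0.1, seed=0):
    """Build (noisy_box, gt_box) pairs for training a box-polishing network.

    Hypothetical helper: each ground-truth box yields `copies` jittered
    versions, and every jittered version is paired with its source box.
    """
    rng = random.Random(seed)
    pairs = []
    for gt in gt_boxes:
        for _ in range(copies):
            pairs.append((jitter_box(gt, scale, rng), gt))
    return pairs


if __name__ == "__main__":
    gts = [(10, 20, 110, 220), (0, 0, 50, 50)]
    for noisy, gt in synthesize_pairs(gts, copies=2):
        print(f"IoU with ground truth: {iou(noisy, gt):.3f}")
```

A box-polishing network would then be trained to map each noisy box back toward its paired ground-truth box; an analogous pairing of perturbed class scores with true categories could serve the category-polishing network, again only as an assumption about how such pairs might be formed.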
