Current semi-supervised object detection (SSOD) algorithms typically assume class-balanced datasets (e.g., PASCAL VOC) or only slightly class-imbalanced datasets (e.g., MS-COCO). This assumption is easily violated, since real-world datasets can be extremely class-imbalanced, and the performance of semi-supervised object detectors in such settings is far from satisfactory. Moreover, this problem remains severely under-explored in SSOD. To bridge this gap, we comprehensively study the class imbalance problem for SSOD under more challenging scenarios, forming the first experimental setting for class-imbalanced SSOD (CI-SSOD). We further propose a simple yet effective gradient-based sampling framework that tackles class imbalance from the perspective of two types of confirmation bias. To tackle confirmation bias towards majority classes, the gradient-based reweighting and gradient-based thresholding modules leverage the gradients from each class to balance the influence of the majority and minority classes. To tackle the confirmation bias arising from incorrect pseudo labels of minority classes, the class-rebalancing sampling module resamples unlabeled data under the guidance of the gradient-based reweighting module. Experiments on three proposed sub-tasks, namely MS-COCO, MS-COCO to Object365 and LVIS, suggest that our method outperforms current class-imbalanced object detectors by clear margins, serving as a baseline for future research in CI-SSOD. Code will be available at https://github.com/nightkeepers/CI-SSOD.
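The abstract names three modules but gives no formulas, so the following is only a minimal sketch of how gradient-guided reweighting, thresholding and resampling could fit together. Every function name, parameter and rule here (inverse-gradient weights, the `base` and `floor` thresholds, mean-weight image sampling) is a hypothetical assumption for illustration, not the authors' implementation.

```python
import numpy as np

def class_weights_from_gradients(grad_norms, power=1.0, eps=1e-12):
    """Assumed rule: weight each class inversely to its accumulated gradient norm."""
    g = np.asarray(grad_norms, dtype=float)
    w = 1.0 / np.power(g + eps, power)
    return w / w.mean()  # normalize so the average class weight is 1

def class_thresholds(weights, base=0.7, floor=0.3):
    """Assumed rule: classes with larger weight get a lower pseudo-label threshold."""
    w = np.asarray(weights, dtype=float)
    rel = w.min() / w  # 1 for the most dominant class, below 1 for the others
    return floor + (base - floor) * rel

def image_sampling_probs(image_classes, weights):
    """Assumed rule: sample an unlabeled image in proportion to the mean
    weight of the classes among its pseudo labels."""
    w = np.asarray(weights, dtype=float)
    scores = np.array(
        [w[list(c)].mean() if len(c) else w.min() for c in image_classes]
    )
    return scores / scores.sum()

# Toy example: class 0 dominates the gradients, class 2 is rare.
weights = class_weights_from_gradients([8.0, 2.0, 1.0])
thresholds = class_thresholds(weights)
probs = image_sampling_probs([[0], [2], [0, 1]], weights)
```

In this toy setup the rare class receives the largest weight and the lowest threshold, and images whose pseudo labels contain it are drawn more often. The sampling step reuses the weights from the reweighting step, mirroring the abstract's statement that resampling follows the guidance of the reweighting module.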