Rethinking Scale Imbalance in Semi-supervised Object Detection for Aerial Images

This paper addresses the scale imbalance problem in semi-supervised object detection (SSOD) for aerial images. Compared with natural images, aerial images contain smaller objects and more objects per image, which makes manual annotation more difficult. Advanced SSOD techniques can train strong detectors from limited labeled data and massive unlabeled data, reducing annotation costs. However, SSOD remains understudied for aerial images and suffers a drastic performance drop when a large proportion of the objects are small. By comparing the predictions for small and large objects, we identify three imbalance issues caused by scale bias: pseudo-label imbalance, label assignment imbalance, and negative learning imbalance. To tackle these issues, we propose a novel Scale-discriminative Semi-Supervised Object Detection (S^3OD) learning pipeline for aerial images. S^3OD comprises three key components, Size-aware Adaptive Thresholding (SAT), Size-rebalanced Label Assignment (SLA), and Teacher-guided Negative Learning (TNL), which together ensure scale-unbiased learning. Specifically, SAT adaptively selects appropriate thresholds to filter pseudo-labels for objects at different scales. SLA balances the positive samples of objects at different scales through resampling and reweighting. TNL alleviates the imbalance in negative samples by leveraging information generated by a teacher model. Extensive experiments on the DOTA-v1.5 benchmark demonstrate the superiority of the proposed method over state-of-the-art competitors. Code will be released soon.
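To make the idea of filtering pseudo-labels with scale-dependent thresholds concrete, the following is a minimal illustrative sketch. It is not the authors' SAT implementation, whose details the abstract does not give. The size bins (borrowed from the COCO small/medium/large area convention), the `keep_ratio` and `floor` parameters, and every function name are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class PseudoLabel:
    width: float   # box width in pixels
    height: float  # box height in pixels
    score: float   # teacher confidence in [0, 1]


# Hypothetical area bins (pixels^2), following the COCO small/medium/large split.
SIZE_BINS = (("small", 32 ** 2), ("medium", 96 ** 2), ("large", float("inf")))


def size_bucket(label):
    """Return the name of the size bin that the box area falls into."""
    area = label.width * label.height
    for name, upper in SIZE_BINS:
        if area < upper:
            return name
    return SIZE_BINS[-1][0]


def adaptive_thresholds(labels, keep_ratio=0.5, floor=0.3):
    """Choose one score threshold per size bin (an assumed rule, not the paper's).

    Each bin keeps roughly its top `keep_ratio` fraction of predictions,
    but the threshold is never allowed to fall below `floor`.
    """
    scores_by_bucket = {}
    for label in labels:
        scores_by_bucket.setdefault(size_bucket(label), []).append(label.score)

    thresholds = {}
    for bucket, scores in scores_by_bucket.items():
        scores.sort(reverse=True)
        k = max(1, math.ceil(keep_ratio * len(scores)))
        thresholds[bucket] = max(floor, scores[k - 1])
    return thresholds


def filter_pseudo_labels(labels, thresholds):
    """Keep only predictions whose score reaches the threshold of their size bin."""
    return [lab for lab in labels if lab.score >= thresholds[size_bucket(lab)]]


labels = [
    PseudoLabel(10, 10, 0.9), PseudoLabel(12, 12, 0.5),
    PseudoLabel(8, 8, 0.4), PseudoLabel(10, 10, 0.2),
    PseudoLabel(200, 200, 0.95), PseudoLabel(150, 150, 0.9),
]
thresholds = adaptive_thresholds(labels)
kept = filter_pseudo_labels(labels, thresholds)
```

In this toy input, a single fixed threshold of 0.9 would retain only one of the four small boxes, whereas the per-bin rule lets the small-object bin settle on a lower threshold than the large-object bin. That is the kind of scale bias in pseudo-label selection that the abstract describes, although the actual thresholding rule in S^3OD may differ.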
