The past few years have witnessed the immense success of object detection, yet even strong current detectors struggle with size-limited instances. Concretely, the well-known problem of low overlap between priors and object regions leads to a constrained sample pool for optimization, and the paucity of discriminative information further hampers recognition. To alleviate these issues, we propose CFINet, a two-stage framework tailored for small object detection and built on a coarse-to-fine pipeline and feature imitation learning. First, we introduce a Coarse-to-fine RPN (CRPN) that ensures sufficient, high-quality proposals for small objects through a dynamic anchor selection strategy and cascade regression. Then, we equip the conventional detection head with a Feature Imitation (FI) branch that, in an imitation manner, strengthens the region representations of the size-limited instances that perplex the model. Moreover, an auxiliary imitation loss following the supervised contrastive learning paradigm is devised to optimize this branch. When integrated with Faster RCNN, CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks SODA-D and SODA-A, underscoring its superiority over the baseline detector and other mainstream detection approaches.
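The low-overlap problem between priors and small object regions can be illustrated with a short numerical sketch. It is not part of CFINet; the helper names `iou` and `shifted_iou`, the box sizes, the 8-pixel offset, and the 0.7 positive threshold are all assumptions chosen for illustration. The sketch shows that the same localization offset leaves a large box well above the threshold while a small box falls far below it, which is one way the sample pool for small objects becomes constrained.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def shifted_iou(size, shift):
    """IoU between a size x size box and a copy shifted by `shift` pixels in x and y."""
    box = (0.0, 0.0, float(size), float(size))
    moved = (float(shift), float(shift), float(size + shift), float(size + shift))
    return iou(box, moved)


# Hypothetical setting: an 8-pixel offset between a prior and the object.
POSITIVE_THRESHOLD = 0.7  # assumed value, for illustration only

small = shifted_iou(16, 8)    # 16 x 16 object
large = shifted_iou(128, 8)   # 128 x 128 object

print(f"16x16 box:   IoU = {small:.3f}, positive = {small >= POSITIVE_THRESHOLD}")
print(f"128x128 box: IoU = {large:.3f}, positive = {large >= POSITIVE_THRESHOLD}")
```

Under these assumed numbers, the small box would not be selected as a positive sample by a fixed-threshold assignment, whereas the large box would.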