Object detection in remote sensing images relies on large amounts of labeled data for training. However, the growing number of new categories and the imbalance between classes make exhaustive annotation impractical. Few-shot object detection (FSOD) addresses this issue by applying meta-learning on seen base classes and then fine-tuning on novel classes with only a few labeled samples. Nonetheless, the large scale and orientation variations of objects in remote sensing images pose significant challenges to existing FSOD methods. To overcome these challenges, we propose integrating a feature pyramid network and using prototype features to enhance query features, thereby improving existing FSOD methods. We refer to this modified FSOD approach as the Strong Baseline, which yields significant performance improvements over the original baselines. Furthermore, we tackle the spatial misalignment between query and support images caused by orientation variations by introducing a Transformation-Invariant Network (TINet). TINet ensures geometric invariance and explicitly aligns the features of the query and support branches, resulting in additional performance gains while maintaining the same inference speed as the Strong Baseline. Extensive experiments on three widely used remote sensing object detection datasets, namely NWPU VHR-10.v2, DIOR, and HRRSD, demonstrate the effectiveness of the proposed method.
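As a rough illustration of what "using prototype features to enhance query features" can look like, the following minimal NumPy sketch pools a few support feature maps into a class prototype and uses it to reweight a query feature map channel by channel. The function names, the sigmoid gate, and the channel-wise multiplication are assumptions made for this example (channel-wise reweighting is a common choice in meta-learning-based FSOD); the abstract does not specify the actual fusion operator, and the sketch omits the feature pyramid and TINet entirely.

```python
import numpy as np


def class_prototype(support_feats: np.ndarray) -> np.ndarray:
    """Average K support feature maps of shape (K, C, H, W) into a C-dim prototype.

    Hypothetical helper: global average pooling over shots and spatial positions.
    """
    return support_feats.mean(axis=(0, 2, 3))


def enhance_query(query_feat: np.ndarray, prototype: np.ndarray) -> np.ndarray:
    """Reweight a query feature map of shape (C, H, W) channel-wise.

    Assumption: the prototype is squashed to (0, 1) with a sigmoid and used
    as a per-channel gate; the paper's operator may differ.
    """
    gate = 1.0 / (1.0 + np.exp(-prototype))
    return query_feat * gate[:, None, None]


# Toy data: K=3 support shots, C=8 channels (random stand-ins for backbone features).
rng = np.random.default_rng(0)
support = rng.normal(size=(3, 8, 4, 4))
query = rng.normal(size=(8, 16, 16))

proto = class_prototype(support)
enhanced = enhance_query(query, proto)
```

In a detector with a feature pyramid, the same reweighting would presumably be applied at each pyramid level; that step is left out here to keep the sketch short.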