Data augmentation has advanced considerably in computer vision over the years and improves model performance, particularly when training data are limited. Currently, most studies adjust the image or its features to expand the size, quality, and variety of training samples in various tasks, including object detection. However, we argue that bounding box transformations deserve investigation as a data augmentation technique in their own right, rather than image-level transformations alone, especially in aerial imagery, where bounding box annotations can be inconsistent. Hence, this letter presents a thorough investigation of bounding box transformation in terms of scaling, rotation, and translation for remote sensing object detection. We call this augmentation strategy NBBOX (Noise Injection into Bounding Box). We conduct extensive experiments on DOTA and DIOR-R, two well-known datasets that include a variety of rotated generic objects in aerial images. Experimental results show that our approach significantly improves remote sensing object detection without bells and whistles, and that it is more time-efficient than other state-of-the-art augmentation strategies.
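To make the idea concrete, the following is a minimal sketch of noise injection into a single rotated bounding box, combining scaling, rotation, and translation. It is an illustration under stated assumptions, not the implementation evaluated in this letter: the function name `jitter_rbox`, the `(cx, cy, w, h, angle)` box encoding, the uniform noise distribution, and all default ranges are hypothetical choices.

```python
import math
import random
from typing import Optional, Tuple

# Assumed box encoding: (center x, center y, width, height, angle in radians).
RBox = Tuple[float, float, float, float, float]


def jitter_rbox(
    box: RBox,
    scale_range: Tuple[float, float] = (0.95, 1.05),  # assumed range
    max_rot_deg: float = 2.0,                          # assumed range
    max_shift_px: float = 2.0,                         # assumed range
    rng: Optional[random.Random] = None,
) -> RBox:
    """Return a copy of `box` with small random scaling, rotation, and translation.

    Only the annotation is perturbed; the image itself is left unchanged.
    """
    rng = rng if rng is not None else random.Random()
    cx, cy, w, h, angle = box

    # Scaling: one shared factor for width and height keeps the aspect ratio.
    s = rng.uniform(*scale_range)

    # Rotation: small angular offset around the box center.
    dtheta = math.radians(rng.uniform(-max_rot_deg, max_rot_deg))

    # Translation: independent pixel shifts of the box center.
    dx = rng.uniform(-max_shift_px, max_shift_px)
    dy = rng.uniform(-max_shift_px, max_shift_px)

    return (cx + dx, cy + dy, w * s, h * s, angle + dtheta)
```

In a training pipeline, such a function would typically be applied to each ground-truth box when a sample is loaded, so that the detector sees slightly different annotations for the same image across epochs. Depending on the angle convention of the detection framework, the perturbed angle may need to be wrapped back into the valid range.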