Data augmentation has advanced considerably in computer vision over the years and improves model performance, particularly when training data are limited. Currently, most studies adjust the image or its features to expand the size, quality, and variety of training samples in various tasks, including object detection. However, we argue that bounding box transformations, rather than image-level transformations, should be investigated as a data augmentation technique, especially in aerial imagery, where bounding box annotations may be inconsistent. Hence, this letter presents a thorough investigation of bounding box transformation in terms of scaling, rotation, and translation for remote sensing object detection. We call this augmentation strategy NBBOX (Noise Injection into Bounding Box). We conduct extensive experiments on DOTA and DIOR-R, two well-known datasets of aerial images that contain a variety of rotated generic objects. Experimental results show that our approach significantly improves remote sensing object detection without bells and whistles and is more time-efficient than other state-of-the-art augmentation strategies.
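To make the idea concrete, the following is a minimal sketch of injecting scaling, rotation, and translation noise into a single rotated bounding box while leaving the image untouched. It is an illustration under stated assumptions, not the implementation evaluated in this letter: the `(cx, cy, w, h, theta)` box encoding, the function name `perturb_box`, and the default noise ranges are all hypothetical choices made for this example.

```python
import math
import random


def perturb_box(box, rng, scale_range=(0.95, 1.05), max_rot_deg=2.0, max_shift_px=2.0):
    """Return a noisy copy of one rotated box; the image itself is not modified.

    box: (cx, cy, w, h, theta), with theta in radians (assumed encoding).
    The default ranges are illustrative placeholders, not values from the letter.
    """
    cx, cy, w, h, theta = box
    # Scaling: one factor for both sides, so the aspect ratio is preserved.
    s = rng.uniform(*scale_range)
    # Rotation: small angular offset around the box centre.
    d_theta = math.radians(rng.uniform(-max_rot_deg, max_rot_deg))
    # Translation: independent pixel shifts of the centre along x and y.
    dx = rng.uniform(-max_shift_px, max_shift_px)
    dy = rng.uniform(-max_shift_px, max_shift_px)
    return (cx + dx, cy + dy, w * s, h * s, theta + d_theta)


rng = random.Random(0)  # seeded so the example is reproducible
box = (100.0, 50.0, 40.0, 20.0, 0.3)
noisy = perturb_box(box, rng)
```

Because only the five box parameters change, a perturbation of this kind costs a few arithmetic operations per annotation and requires no image resampling.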