Understanding how data set design affects model training and performance can help reduce the cost of generating labeled remote sensing and overhead data. This work examined the impact of training shifted window transformers with bounding boxes versus segmentation labels, where the latter are more expensive to produce. For classification tasks, we compared models trained on both target and background pixels against models trained on only target pixels, extracted using segmentation labels. For object detection, we compared models trained with either label type. We found that models trained on only target pixels did not show improved classification performance and appeared to conflate background pixels in the evaluation set with target pixels. For object detection, models trained with either label type showed equivalent performance across testing. Bounding boxes therefore appeared to be sufficient for tasks that do not require more complex labels, such as object segmentation. Continuing work to determine whether this result holds across data types and model architectures could yield substantial savings in generating remote sensing data sets for deep learning.
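To make the two classification inputs concrete, the following is a minimal sketch of how an image chip containing target and background could differ from a chip containing only target pixels. It is an illustration under stated assumptions, not the procedure used in this work: the function names (`crop_with_background`, `crop_target_only`, `box_from_mask`), the zero fill value for background pixels, and the toy image and mask are all hypothetical.

```python
import numpy as np


def box_from_mask(mask):
    """Return the tight bounding box (x0, y0, x1, y1) around a binary mask.

    x1 and y1 are exclusive, so the box can be used directly for slicing.
    """
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1


def crop_with_background(image, box):
    """Crop the bounding box region, keeping target and background pixels."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1].copy()


def crop_target_only(image, mask, box, fill=0):
    """Crop the same region, but overwrite pixels outside the mask.

    The fill value is an assumption made for this sketch; the source does
    not state how background pixels were removed.
    """
    x0, y0, x1, y1 = box
    chip = image[y0:y1, x0:x1].copy()
    chip_mask = mask[y0:y1, x0:x1].astype(bool)
    chip[~chip_mask] = fill  # applies to every channel of each background pixel
    return chip


# Toy 6x6 RGB image whose pixel values are all nonzero.
image = (np.arange(6 * 6 * 3, dtype=np.uint8) + 1).reshape(6, 6, 3)

# Toy plus-shaped segmentation mask covering five pixels.
mask = np.zeros((6, 6), dtype=bool)
mask[2, 1:4] = True
mask[1:4, 2] = True

box = box_from_mask(mask)
with_bg = crop_with_background(image, box)
target_only = crop_target_only(image, mask, box)
```

In this sketch both chips have the same shape, so either could be fed to the same classifier; the only difference is whether the pixels inside the box but outside the segmentation mask carry image content.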