Human-annotated label noise and their impact on ConvNets for remote sensing image scene classification

From arXiv. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Convolutional neural networks (ConvNets) have been successfully applied to satellite image scene classification. Human-labeled training datasets are essential for ConvNets to classify accurately, yet errors in these datasets are unavoidable due to the complexity of satellite images. However, the distribution of real-world human-annotated label noise on remote sensing images and its impact on ConvNets have not been investigated. To fill this research gap, this study, for the first time, collected real-world labels from 32 participants and explored how the label noise in their annotations affects three representative ConvNets (VGG16, GoogleNet, and ResNet-50) for remote sensing image scene classification. We found that: (1) human-annotated label noise exhibits significant class and instance dependence; (2) an additional 1% of human-annotated label noise in the training data leads to a 0.5% reduction in the overall accuracy of ConvNet classification; (3) the error pattern of ConvNet predictions is strongly correlated with that of the participants' labels. To uncover the mechanism underlying the impact of human labeling errors on ConvNets, we further compared human-annotated label noise with three types of simulated label noise: uniform noise, class-dependent noise, and instance-dependent noise. Our results show that the impact of human-annotated label noise on ConvNets differs significantly from that of all three types of simulated label noise, while both class dependence and instance dependence contribute to the impact of human-annotated label noise on ConvNets. These observations call for a reevaluation of how noisy labels are handled, and we anticipate that our real-world label noise dataset will facilitate the future development and assessment of label-noise learning algorithms.
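To make the three simulated noise types named above concrete, the following is a minimal sketch of how each is commonly generated. It is not the authors' procedure, which the abstract does not specify. All function names, the `transition` matrix, and the per-instance `flip_probs` are hypothetical and introduced only for illustration.

```python
import random


def _flip(y, num_classes, rng):
    """Return a class different from y, chosen uniformly at random."""
    return rng.choice([c for c in range(num_classes) if c != y])


def instance_dependent_noise(labels, flip_probs, num_classes, seed=0):
    """Flip label i with its own probability flip_probs[i].

    flip_probs is an assumed stand-in for per-image difficulty; real
    instance-dependent noise would derive it from image content.
    """
    rng = random.Random(seed)
    return [
        _flip(y, num_classes, rng) if rng.random() < p else y
        for y, p in zip(labels, flip_probs)
    ]


def uniform_noise(labels, num_classes, rate, seed=0):
    """Flip every label with the same probability, to any other class."""
    return instance_dependent_noise(
        labels, [rate] * len(labels), num_classes, seed
    )


def class_dependent_noise(labels, transition, seed=0):
    """Resample each label y from row transition[y].

    transition is a row-stochastic matrix: transition[y][c] is the
    probability that true class y is annotated as class c.
    """
    rng = random.Random(seed)
    classes = list(range(len(transition)))
    return [rng.choices(classes, weights=transition[y], k=1)[0] for y in labels]


def noise_rate(clean, noisy):
    """Fraction of labels that differ from the clean labels."""
    return sum(a != b for a, b in zip(clean, noisy)) / len(clean)


if __name__ == "__main__":
    clean = [0, 1, 2, 1, 0, 2] * 100
    noisy = uniform_noise(clean, num_classes=3, rate=0.1)
    print(f"observed noise rate: {noise_rate(clean, noisy):.3f}")
```

In this sketch, uniform noise is the special case of instance-dependent noise in which every sample shares one flip probability, and class-dependent noise depends only on the true class. Human annotation errors, according to the findings above, depend on both the class and the individual image, so none of these simple generators alone would be expected to reproduce them.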
