Although many datasets exist for traffic sign classification, few have been collected for traffic sign recognition, and fewer still contain enough instances to train a deep learning model. Deep learning is almost the only practical way to train a model for real-world use that covers many highly similar classes, unlike traditional approaches that rely on cues such as color and shape. Moreover, for certain sign classes, the meaning of the sign itself makes it hard to collect enough instances for the dataset. To address this problem, we propose a data augmentation method for traffic sign recognition datasets that takes advantage of the standardized design of traffic signs. We call it TSR dataset augmentation. We verify the method on the benchmark Tsinghua-Tencent 100K (TT100K) dataset. We applied the method to four main iterated versions of the dataset derived from TT100K, and the experimental results show that the method is effective. The iterated dataset versions based on TT100K, the source code of the data augmentation method, and the training results presented in this paper are publicly available.