In facial expression recognition, classification accuracy is often limited by the small number of available training samples. To address this problem, we propose a new data augmentation method named MixCut. The method first interpolates two original training samples at the pixel level with a random ratio to generate a new sample. It then removes the pixels in random square regions of the new sample to produce the final training sample. We evaluated MixCut on Fer2013Plus and RAF-DB. With MixCut, we achieved 85.63% accuracy in eight-label classification on Fer2013Plus and 87.88% accuracy in seven-label classification on RAF-DB. On Fer2013Plus, MixCut improved accuracy by +0.59%, +0.36%, and +0.39% over three other data augmentation methods, CutOut, Mixup, and CutMix, respectively. On RAF-DB, the corresponding improvements over these three methods were +0.22%, +0.65%, and +0.5%.
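The two steps described above (pixel-level interpolation, then square pixel removal) can be sketched as follows. This is a minimal illustration and not the authors' implementation. The abstract does not specify how the mixing ratio is drawn, how labels are combined, how large the removed square is, or what value replaces the removed pixels. The sketch assumes a ratio drawn from a Beta(alpha, alpha) distribution and labels mixed in the same ratio (as is common for Mixup), plus a single zero-filled square of hypothetical side `cut_size` (as is common for CutOut). The function name `mixcut` and all parameter names are illustrative.

```python
import numpy as np


def mixcut(x1, y1, x2, y2, rng, alpha=1.0, cut_size=8):
    """Sketch of a MixCut-style augmentation for one pair of samples.

    x1, x2: images of identical shape (H, W) or (H, W, C).
    y1, y2: one-hot label vectors of identical length.
    alpha, cut_size: assumed hyperparameters, not taken from the paper.
    """
    # Step 1: interpolate the two images at the pixel level with a random ratio.
    # Assumption: the ratio follows Beta(alpha, alpha), as in Mixup.
    lam = rng.beta(alpha, alpha)
    mixed = lam * x1 + (1.0 - lam) * x2

    # Assumption: labels are interpolated with the same ratio.
    label = lam * y1 + (1.0 - lam) * y2

    # Step 2: remove the pixels in a randomly located square region.
    # Assumption: one square, clipped at the image border, filled with zeros.
    h, w = mixed.shape[:2]
    cy = rng.integers(0, h)
    cx = rng.integers(0, w)
    half = cut_size // 2
    top, bottom = max(cy - half, 0), min(cy + half, h)
    left, right = max(cx - half, 0), min(cx + half, w)
    mixed[top:bottom, left:right] = 0.0

    return mixed, label, lam
```

A call such as `mixcut(img_a, onehot_a, img_b, onehot_b, np.random.default_rng(0))` returns the augmented image, the mixed label, and the sampled ratio. Returning the ratio makes it easy to check or log how strongly each source image contributes.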