In this paper, we propose LAFD (Light and Accurate Face Detection), a lightweight and accurate face detection algorithm based on RetinaFace. The backbone is a modified MobileNetV3 in which the convolution kernel sizes, the channel expansion multipliers of the inverted residual blocks, and the use of the SE attention mechanism are adjusted. A deformable convolution network (DCN) is introduced in the context module, and focal loss replaces cross-entropy loss as the model's classification loss. Test results on the WIDER FACE dataset show that LAFD achieves an average precision (AP) of 94.1%, 92.2% and 82.1% on the "easy", "medium" and "hard" validation subsets respectively, an improvement of 3.4, 4.0 and 8.3 percentage points over RetinaFace and of 3.1, 4.1 and 4.1 percentage points over LFFD, a well-performing lightweight model. When the input image is pre-processed and scaled to 1560px in length or 1200px in width, the model achieves an AP of 86.2% on the "hard" validation subset. The model is lightweight, with a size of only 10.2MB.
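To illustrate why focal loss can help where cross-entropy struggles, the following is a minimal sketch of the standard binary focal loss for a single face/background prediction. It is not the LAFD implementation: the function names, the `alpha=0.25` and `gamma=2.0` defaults (the values commonly used with focal loss), and the clamping constant `eps` are all assumptions made for illustration, since the abstract does not state the settings used.

```python
import math


def cross_entropy(p, y, eps=1e-7):
    """Binary cross-entropy for one prediction.

    p: predicted probability that the anchor is a face (0..1).
    y: ground-truth label, 1 for face, 0 for background.
    """
    p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p       # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    alpha and gamma are assumed defaults, not values taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # Compare an easy positive (p = 0.9) with a hard positive (p = 0.1).
    for p in (0.9, 0.1):
        print(f"p={p}: CE={cross_entropy(p, 1):.4f}  FL={focal_loss(p, 1):.4f}")
```

The modulating factor `(1 - p_t) ** gamma` reduces the contribution of easy examples far more than that of hard ones, so training is less dominated by the large number of easy background anchors. Setting `gamma=0` and `alpha=0.5` reduces the expression to cross-entropy scaled by one half.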