In recent years, deep convolutional neural networks (CNNs) have significantly advanced face detection. In particular, lightweight CNN-based architectures have achieved great success because their low-complexity structure facilitates real-time detection. However, current lightweight CNN-based face detectors trade accuracy for efficiency and cope poorly with insufficient feature representation, faces with unbalanced aspect ratios, and occlusion. Consequently, their performance lags far behind that of heavyweight deep detectors. To achieve efficient face detection without sacrificing accuracy, we design an efficient deep face detector termed EfficientFace, which contains three modules for feature enhancement. First, we design a novel cross-scale feature fusion strategy that facilitates bottom-up information propagation and thereby strengthens the fusion of low-level and high-level features. This also helps estimate the locations of faces and enhances the descriptive power of face features. Second, we introduce a Receptive Field Enhancement module to handle faces with various aspect ratios. Third, we add an Attention Mechanism module to improve the representational capability for occluded faces. We evaluate EfficientFace on four public benchmarks, and the experimental results demonstrate the appealing performance of our method. In particular, our model achieves 95.1% (Easy), 94.0% (Medium), and 90.1% (Hard) on the validation set of the WIDER Face dataset, which is competitive with heavyweight models at only 1/15 of the computational cost of the state-of-the-art MogFace detector.