Fine-grained semantic segmentation of a person's face and head, including facial parts and head components, has progressed a great deal in recent years. However, it remains a challenging task, and ambiguous occlusions and large pose variations are particularly difficult to handle. To overcome these difficulties, we propose a novel framework termed Mask-FPAN. It uses a de-occlusion module that learns to parse occluded faces in a semi-supervised way. In particular, face landmark localization, face occlusion estimation, and detected head poses are taken into account. A 3D morphable face model combined with the UV GAN improves the robustness of 2D face parsing. In addition, we introduce two new datasets, FaceOccMask-HQ and CelebAMaskOcc-HQ, for face parsing work. The proposed Mask-FPAN framework addresses the face parsing problem in the wild and shows significant performance improvements, raising mIoU from 0.7353 to 0.9013 compared with the state of the art on challenging face datasets.
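The mean intersection-over-union (mIoU) score reported above is the usual summary metric for parsing quality. The abstract does not say how it is computed, so the following is only a minimal sketch of the common definition: per-class IoU taken from a confusion matrix, then averaged over the classes that occur in either the prediction or the ground truth. The function name `mean_iou`, its arguments, and the choice to skip absent classes are assumptions made for illustration, not details taken from Mask-FPAN.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred, target: integer label maps of the same shape, values in [0, num_classes).
    Hypothetical helper; the averaging convention is an assumption.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    # Union = predicted pixels + ground-truth pixels - pixels counted in both.
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Skip classes absent from both maps to avoid dividing by zero.
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())


if __name__ == "__main__":
    pred = [[0, 1], [1, 1]]
    target = [[0, 1], [0, 1]]
    print(mean_iou(pred, target, num_classes=2))
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Some evaluation protocols instead average over all classes or accumulate the confusion matrix over the whole dataset before dividing, which can give different numbers from a per-image average.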