Fine-grained semantic segmentation of a person's face and head, including facial parts and head components, has progressed a great deal in recent years. However, it remains a challenging task, and ambiguous occlusions and large pose variations are particularly difficult to handle. To overcome these difficulties, we propose a novel framework termed Mask-FPAN. It uses a de-occlusion module that learns to parse occluded faces in a semi-supervised way. In particular, face landmark localization, face occlusion estimation, and detected head poses are taken into account. A 3D morphable face model combined with the UV GAN improves the robustness of 2D face parsing. In addition, we introduce two new datasets, FaceOccMask-HQ and CelebAMaskOcc-HQ, for face parsing work. The proposed Mask-FPAN framework addresses the face parsing problem in the wild and shows a significant performance improvement on challenging face datasets, raising the mean intersection over union (MIOU) from 0.7353 for the state of the art to 0.9013.
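For readers unfamiliar with the MIOU metric quoted above, the following is a minimal sketch of how a mean intersection over union can be computed for a pair of label maps. It is an illustration only: the function name `mean_iou`, the nested-list input format, and the choice to skip classes that appear in neither map are assumptions, and the exact evaluation protocol behind the reported numbers (for example per-image versus dataset-level averaging) is not stated here.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred, target: equally shaped 2D lists of integer class labels
    (hypothetical input format chosen for this sketch).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel lies in both the predicted and true region of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel lies in the predicted region of p and the true region of t.
                union[p] += 1
                union[t] += 1
    # Assumption: classes absent from both maps are left out of the average.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 2x2 label maps with three classes (e.g. background, skin, hair).
pred = [[0, 1], [1, 2]]
target = [[0, 1], [2, 2]]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoU values are 1, 1/2, and 1/2, so the mean is 2/3. A single mislabeled pixel lowers the score of both the class it was assigned to and the class it belongs to, which is why occluded regions can weigh heavily on the metric.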