Face parsing infers a pixel-wise label map for each semantic facial component. Previous methods generally work well on unoccluded faces, but they overlook facial occlusion and ignore contextual areas outside a single face, even though facial occlusion has become common during the COVID-19 epidemic. Inspired by the illumination theory of images, we propose novel homogeneous tanh-transforms for image preprocessing, which consist of four tanh-transforms that fuse central vision and peripheral vision. The proposed method addresses the dilemma of face parsing under occlusion and compresses more information from the surrounding context. Based on the homogeneous tanh-transforms, we propose an occlusion-aware convolutional neural network for occluded face parsing. It combines information from both Tanh-polar space and Tanh-Cartesian space, which enhances the receptive fields. Furthermore, we introduce an occlusion-aware loss that focuses on the boundaries of occluded regions. The network is simple and flexible, and can be trained end-to-end. To facilitate future research on occluded face parsing, we also contribute a new cleaned face parsing dataset, manually purified from several academic and industrial datasets, including CelebAMask-HQ, Short-video Face Parsing, and Helen, and we will make it public. Experiments demonstrate that our method surpasses state-of-the-art methods for face parsing under occlusion.