Face Super-Resolution (FSR) aims to recover high-resolution (HR) face images from low-resolution (LR) ones. Despite the progress made by convolutional neural networks in FSR, the results of existing approaches remain unsatisfactory because of their low reconstruction efficiency and insufficient use of prior information. Since faces are highly structured objects, effectively leveraging facial priors is a natural way to improve FSR results. This paper proposes a novel network architecture, W-Net, to address this challenge. W-Net uses a carefully designed Parsing Block to fully exploit the resolution potential of the LR image. The resulting parsing map serves as an attention prior, which allows information from the parsing map and the LR image to be integrated effectively. In addition, the W-shaped network structure, combined with the LR-Parsing Map Fusion Module (LPF), performs multiple fusions across various dimensions. We also use the facial parsing map as a mask, assigning different weights and loss functions to key facial areas so that the reconstructed face images balance perceptual quality and pixel accuracy. We conducted extensive comparative experiments that cover not only conventional FSR metrics but also downstream tasks such as face recognition and facial keypoint detection. The experiments demonstrate that W-Net performs strongly in quantitative metrics, visual quality, and downstream tasks.
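The idea of using the parsing map as a mask that weights key facial areas can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_weighted_l1`, the label ids, and the weight values are hypothetical, and the sketch covers only per-region weighting of a pixel loss, not the use of different loss functions per region.

```python
from typing import Dict, List

Image = List[List[float]]   # H x W grayscale image, values in [0, 1]
LabelMap = List[List[int]]  # H x W parsing map, one region label per pixel


def mask_weighted_l1(sr: Image, hr: Image, parsing: LabelMap,
                     region_weights: Dict[int, float],
                     default_weight: float = 1.0) -> float:
    """Weighted mean absolute error; each pixel is weighted by its parsing region."""
    total = 0.0
    weight_sum = 0.0
    for sr_row, hr_row, label_row in zip(sr, hr, parsing):
        for s, h, label in zip(sr_row, hr_row, label_row):
            # Regions missing from region_weights fall back to default_weight.
            w = region_weights.get(label, default_weight)
            total += w * abs(s - h)
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


# Toy 2 x 2 example. Hypothetical labels: 0 = background or skin, 1 = eye region.
sr = [[0.5, 0.5], [0.0, 1.0]]   # reconstructed image
hr = [[0.5, 1.0], [0.0, 1.0]]   # ground truth; only the eye pixel differs
parsing = [[0, 1], [0, 0]]

uniform = mask_weighted_l1(sr, hr, parsing, {})            # all weights 1.0
emphasized = mask_weighted_l1(sr, hr, parsing, {1: 3.0})   # eye errors count 3x
```

With the eye region up-weighted, the same reconstruction error contributes more to the loss (0.25 instead of 0.125 in this toy case), which is the mechanism by which a mask can steer training toward key facial areas.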