Low-quality face videos captured in the real world suffer from multiple complex, coupled degradations. Blind video face restoration is therefore a highly challenging ill-posed problem: it requires not only hallucinating high-fidelity details but also maintaining temporal coherence across diverse pose variations. Naively restoring each frame independently inevitably introduces temporal incoherence, as well as artifacts caused by pose changes and keypoint localization errors. To address this, we propose the first blind video face restoration approach that requires no pre-alignment, built on a novel parsing-guided temporal-coherent transformer (PGTFormer). PGTFormer leverages semantic parsing guidance to select optimal face priors for generating temporally coherent, artifact-free results. Specifically, we first pre-train a temporal-spatial vector-quantized auto-encoder on high-quality video face datasets to extract expressive, context-rich priors. Then, the temporal parse-guided codebook predictor (TPCP) restores faces in different poses based on face parsing context cues, without performing face pre-alignment. This strategy reduces artifacts and mitigates the jitter caused by cumulative errors from face pre-alignment. Finally, the temporal fidelity regulator (TFR) enhances fidelity through temporal feature interaction and improves temporal consistency across the video. Extensive experiments on face videos show that our method outperforms previous face restoration baselines. The code will be released at \href{https://github.com/kepengxu/PGTFormer}{https://github.com/kepengxu/PGTFormer}.
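As a hedged illustration of the codebook lookup that a vector-quantized auto-encoder typically performs, the standard nearest-neighbor quantization step can be written as follows. This is the generic VQ formulation, assumed here rather than taken from the abstract, and the symbols $z_t$, $c_k$, $K$, $k^*$, and $\hat{z}_t$ are illustrative names. An encoder feature $z_t$ from frame $t$ is replaced by its closest entry in a learned codebook $\{c_k\}_{k=1}^{K}$:
\begin{equation}
\hat{z}_t = c_{k^*}, \qquad k^* = \arg\min_{k \in \{1,\dots,K\}} \lVert z_t - c_k \rVert_2^2 .
\end{equation}
Under this reading, a codebook predictor such as TPCP would plausibly estimate the index $k^*$ from degraded input features instead of obtaining it by nearest-neighbor search; that interpretation is an assumption and is not stated in the abstract.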