Recent years have witnessed an exponential increase in the demand for face video compression, and the success of artificial intelligence has expanded the boundaries beyond traditional hybrid video coding. Generative coding approaches, which leverage the statistical priors of face videos, have been identified as promising alternatives with reasonable perceptual rate-distortion trade-offs. However, the great diversity of distortion types in the spatial and temporal domains, ranging from those of traditional hybrid coding frameworks to those of generative models, presents significant challenges for compressed face video quality assessment (VQA). In this paper, we introduce the large-scale Compressed Face Video Quality Assessment (CFVQA) database, the first attempt to systematically understand the perceptual quality and diversified compression distortions of face videos. The database contains 3,240 compressed face video clips at multiple compression levels, derived from 135 source videos with diversified content using six representative video codecs: two traditional methods based on hybrid coding frameworks, two end-to-end methods, and two generative methods. In addition, we develop a FAce VideO IntegeRity (FAVOR) index for face video compression, which measures perceptual quality by considering the distinct content characteristics and temporal priors of face videos. Experimental results demonstrate its superior performance on the proposed CFVQA dataset. The benchmark is publicly available at: https://github.com/Yixuan423/Compressed-Face-Videos-Quality-Assessment.