Deep-learning-based face-swap videos, also known as deepfakes, are becoming increasingly realistic and deceptive. The malicious use of these face-swap videos has raised wide concern. The research community has focused on the automatic detection of these fake videos, but the assessment of their visual realism, as perceived by human eyes, remains an unexplored dimension. Visual realism assessment (VRA) is essential for assessing the potential impact of a specific face-swap video, and it is also important as a quality assessment metric for comparing different face-swap methods. In this paper, we take a small step towards this new VRA direction by building a benchmark for evaluating the effectiveness of different automatic VRA models, which range from traditional hand-crafted features to different kinds of deep-learning features. The evaluations are based on a recent competition dataset named DFGC 2022, which contains 1400 diverse face-swap videos annotated with Mean Opinion Scores (MOS) on visual realism. Comprehensive experimental results using 11 models and 3 protocols are shown and discussed. We demonstrate the feasibility of devising effective VRA models for assessing face-swap videos and methods. The particular usefulness of existing deepfake detection features for VRA is also noted. The code and benchmark will be made publicly available.
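To make the evaluation setting concrete, the following is a minimal sketch of how a VRA model's predictions might be compared against MOS annotations. The abstract does not name the evaluation metrics, so the use of Pearson and Spearman correlation here is an assumption borrowed from common practice in quality assessment, not the benchmark's confirmed protocol. The ratings, the 1-5 scale, the video names, and the predicted scores are all hypothetical.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average the individual rater scores for one video."""
    return mean(ratings)


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical ratings on a 1-5 realism scale (made up for illustration).
ratings_per_video = {
    "video_a": [4, 5, 4, 4, 5],
    "video_b": [2, 1, 2, 3, 2],
    "video_c": [3, 3, 4, 3, 3],
    "video_d": [5, 4, 5, 5, 4],
}
# Hypothetical outputs of some VRA model for the same videos.
predicted = {"video_a": 4.1, "video_b": 2.4, "video_c": 3.0, "video_d": 4.6}

videos = sorted(ratings_per_video)
mos = [mean_opinion_score(ratings_per_video[v]) for v in videos]
pred = [predicted[v] for v in videos]

print("MOS:", dict(zip(videos, mos)))
print("Pearson:", round(pearson(mos, pred), 3))
print("Spearman:", round(spearman(mos, pred), 3))
```

Rank correlation rewards a model for ordering videos by realism in the same way human raters do, while linear correlation also reflects how closely the predicted scores track the MOS values themselves. Reporting both is a design choice of this sketch rather than something stated in the abstract.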