With recent advances in generative modeling, the realism of deepfake content has been increasing at a steady pace, to the point where people often fail to detect manipulated media online and are consequently deceived by various kinds of scams. In this paper, we survey deepfake generation and detection techniques, including the most recent developments in the field, such as diffusion models and Neural Radiance Fields. Our literature review covers all deepfake media types, comprising image, video, audio and multimodal (audio-visual) content. We identify various kinds of deepfakes, according to the procedure used to alter or generate the fake content. We further construct a taxonomy of deepfake generation and detection methods, illustrating the important groups of methods and the domains where these methods are applied. Next, we gather datasets used for deepfake detection and provide updated rankings of the best-performing deepfake detectors on the most popular datasets. In addition, we develop a novel multimodal benchmark to evaluate deepfake detectors on out-of-distribution content. The results indicate that state-of-the-art detectors fail to generalize to deepfake content generated by unseen deepfake generators. Finally, we propose future directions to obtain robust and powerful deepfake detectors. Our project page and new benchmark are available at https://github.com/CroitoruAlin/biodeep.