With the rapid development of deep learning, new video forgery methods continue to emerge. Methods for detecting the resulting fake videos have achieved excellent performance on some datasets. However, these methods generalize poorly to unseen videos and do not cope well with new forgery methods. To address this challenging problem, we propose UVL, a novel unified video tampering localization framework for synthetic forgeries. Specifically, UVL extracts features common to synthetic forgeries: boundary artifacts along synthetic edges, the unnatural distribution of generated pixels, and the lack of correlation between the forged region and the original content. These features are widely present across different types of synthetic forgery and help improve generalization to unseen videos. Extensive experiments on three types of synthetic forgery (video inpainting, video splicing, and DeepFake) show that UVL achieves state-of-the-art performance on various benchmarks and outperforms existing methods by a large margin in cross-dataset evaluation.