DETER: Detecting Edited Regions for Deterring Generative Manipulations

Generative AI capabilities have grown substantially in recent years, raising renewed concerns about the malicious use of generated data, or "deep fakes". Deep fake datasets, however, have not kept pace with these advances, which limits the development of detection technology that can meaningfully alert human users in real-world settings. Existing datasets typically rely on GAN-based models and introduce spurious correlations by always editing similar face regions. To address these shortcomings, we introduce DETER, a large-scale dataset for DETEcting edited image Regions and deterring modern advanced generative manipulations. DETER includes 300,000 images manipulated by four state-of-the-art generators with three editing operations: face swapping (a standard coarse image manipulation), inpainting (a novel manipulation for deep fake datasets), and attribute editing (a subtle fine-grained manipulation). While face swapping and attribute editing are performed on similar face regions such as the eyes and nose, inpainting can be performed on random image regions, removing the spurious correlations of previous datasets. Careful image post-processing ensures that the deep fakes in DETER look realistic, and human studies confirm that the human deep fake detection rate on DETER is 20.4% lower than on other fake datasets. Equipped with the dataset, we conduct extensive experiments and breakdown analysis using our rich annotations and improved benchmark protocols, revealing future directions and the next set of challenges in developing reliable regional fake detection models.
