Deepfakes pose a serious threat to digital well-being by fueling misinformation. As deepfakes get harder to recognize with the naked eye, human users become increasingly reliant on deepfake detection models to decide if a video is real or fake. Currently, models yield a prediction for a video's authenticity, but do not integrate a method for alerting a human user. We introduce a framework for amplifying artifacts in deepfake videos to make them more detectable by people. We propose a novel, semi-supervised Artifact Attention module, which is trained on human responses to create attention maps that highlight video artifacts. These maps make two contributions. First, they improve the performance of our deepfake detection classifier. Second, they allow us to generate novel "Deepfake Caricatures": transformations of the deepfake that exacerbate artifacts to improve human detection. In a user study, we demonstrate that Caricatures greatly increase human detection, across video presentation times and user engagement levels. Overall, we demonstrate the success of a human-centered approach to designing deepfake mitigation methods.
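The abstract does not say how the attention maps are turned into a Caricature, so the following is only a minimal sketch of one way the idea could work, not the authors' implementation. It assumes a grayscale video stored as a NumPy array and an attention map of the same shape with values in [0, 1]. The function name `caricature` and the `gain` parameter are hypothetical. The sketch exaggerates frame-to-frame changes only where the attention map is high, so regions flagged as artifacts become more visibly distorted while the rest of the video is left alone.

```python
import numpy as np


def caricature(frames, attention, gain=3.0):
    """Exaggerate temporal change where the attention map is high.

    Illustrative assumption only, not the method described in the paper.

    frames:    float array, shape (T, H, W), pixel values in [0, 1]
    attention: float array, shape (T, H, W), weights in [0, 1]
    gain:      hypothetical amplification strength (0 leaves the video unchanged)
    """
    frames = np.asarray(frames, dtype=np.float64)
    attention = np.asarray(attention, dtype=np.float64)
    if frames.shape != attention.shape:
        raise ValueError("frames and attention must have the same shape")

    # Difference of each frame from its predecessor; the first frame has none.
    delta = np.zeros_like(frames)
    delta[1:] = frames[1:] - frames[:-1]

    # Add back the attention-weighted difference, then keep pixels in range.
    return np.clip(frames + gain * attention * delta, 0.0, 1.0)
```

With `gain=0.0` the input is returned unchanged, and pixels with zero attention are never modified, which keeps the distortion confined to the highlighted regions.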